# Web
The web module provides tools for interacting with web content. These tools allow you to fetch and process web pages in various formats.
## Available tools
### web-fetch_page

Fetches content from a web page and returns it in a clean, readable format with preserved structure.
#### Parameters

- `url` (string, required): The URL to fetch
  - Must be a valid HTTP or HTTPS URL
- `format` (string, optional): Output format for the page content
  - Options:
    - `text` (default): Returns clean, readable text with preserved structure
    - `html`: Returns raw HTML content
  - Use `text` for most cases when you want to read or analyze the content
  - Only use `html` when you need to process the raw HTML structure
- `headers` (object, optional): Custom headers to send with the request
  - Useful for:
    - Authentication
    - Setting the user agent
    - Custom request headers
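To make the parameter shapes concrete, here is a sketch of argument payloads the agent might pass to `web-fetch_page`. The invocation mechanism itself is internal to the agent; these dictionaries only illustrate the parameters described above, and the token and user-agent values are placeholders.

```python
# Hypothetical argument payloads for web-fetch_page; all values are
# placeholders, not real credentials or endpoints.
fetch_text_args = {
    "url": "https://example.com/article",
    "format": "text",  # default: clean, readable text
}

fetch_html_with_headers_args = {
    "url": "https://example.com/protected/page",
    "format": "html",  # raw HTML, for structural processing
    "headers": {
        "Authorization": "Bearer <token>",  # authentication
        "User-Agent": "tyler-agent/1.0",    # custom user agent
    },
}
```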
#### Example usage
```python
import asyncio

from tyler.models import Agent, Thread, Message

# Create an agent with web tools
agent = Agent(
    model_name="gpt-4o",
    purpose="To help with web content",
    tools=["web"]
)

async def main():
    # Create a thread with a request to fetch a web page
    thread = Thread()
    message = Message(
        role="user",
        content="Can you fetch and summarize the content from https://example.com?"
    )
    thread.add_message(message)

    # Process the thread - the agent will use the web-fetch_page tool
    processed_thread, new_messages = await agent.go(thread)

asyncio.run(main())
```
#### Response format

The tool returns a dictionary with:

- `content`: The fetched content in the requested format
- `metadata`: Additional information about the page
  - `title`: Page title if available
  - `url`: Final URL (after any redirects)
  - `status`: HTTP status code
  - `headers`: Response headers
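For orientation, a successful response might look like the sketch below. The field names follow the list above; the values and the nesting under `metadata` are illustrative assumptions.

```python
# Illustrative successful response; values are placeholders.
response = {
    "content": "Example Domain\n\nThis domain is for use in illustrative examples...",
    "metadata": {
        "title": "Example Domain",
        "url": "https://example.com/",  # final URL after redirects
        "status": 200,
        "headers": {"content-type": "text/html; charset=UTF-8"},
    },
}
```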
#### Error handling
The tool handles common errors:
- Invalid URLs
- Network timeouts
- HTTP errors (4xx, 5xx)
- Invalid content types
Error responses include:
- Error message
- Error type
- HTTP status code (if applicable)
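An error response might look like the sketch below. The documentation only specifies that the message, type, and status code are included; the exact key names and layout here are assumptions.

```python
# Hypothetical error response shape; key names are assumptions.
error_response = {
    "error": "HTTP 404: Not Found",  # error message
    "error_type": "http_error",      # error type
    "status": 404,                   # omitted for non-HTTP failures (e.g. timeouts)
}
```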
## Best practices

1. **Use Text Format by Default**
   - The `text` format is optimized for readability
   - Preserves important structure while removing clutter
   - Better for content analysis and summarization
2. **Handle Large Pages**
   - Consider using pagination for large content
   - Process content in chunks if needed
   - Be aware of rate limiting and robots.txt
3. **Respect Website Policies**
   - Check robots.txt (a pre-fetch check is sketched after this list)
   - Use appropriate delays between requests
   - Include proper user agent headers
4. **Security Considerations**
   - Only fetch from trusted sources
   - Be cautious with user-provided URLs (the same sketch validates the URL scheme and host)
   - Validate and sanitize content
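The policy and security practices above can be enforced before a URL ever reaches the tool. Below is a minimal pre-fetch check using only the Python standard library; the `tyler-agent/1.0` user agent string is a placeholder, and this is a sketch rather than anything built into the web tools.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

def is_safe_to_fetch(url: str, user_agent: str = "tyler-agent/1.0") -> bool:
    parsed = urlparse(url)
    # Only allow HTTP(S) URLs with a hostname (guards user-provided input)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    # Respect the site's robots.txt
    robots = RobotFileParser()
    robots.set_url(f"{parsed.scheme}://{parsed.netloc}/robots.txt")
    try:
        robots.read()
    except OSError:
        return False  # be conservative if robots.txt is unreachable
    return robots.can_fetch(user_agent, url)

print(is_safe_to_fetch("https://example.com/page"))
```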
## Configuration
No special configuration or environment variables are required for web tools.
## Common use cases

1. **Content Extraction**
   - Fetch articles for summarization
   - Extract specific information from web pages
   - Gather data for analysis
2. **Web Scraping**
   - Collect structured data from websites
   - Monitor page changes (a minimal change check is sketched after this list)
   - Archive content
3. **Site Analysis**
   - Check page availability
   - Analyze page structure
   - Validate links
4. **Content Integration**
   - Import content from external sources
   - Aggregate information
   - Cross-reference data
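For the page-change monitoring case, one lightweight approach is to fingerprint the fetched `text` content and compare fingerprints between runs. This is a sketch of that idea, not part of the web tools themselves.

```python
import hashlib

def content_fingerprint(text: str) -> str:
    """Stable fingerprint of a page's fetched text content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Compare against the fingerprint stored from the previous fetch
def has_changed(new_text: str, previous_fingerprint: str) -> bool:
    return content_fingerprint(new_text) != previous_fingerprint
```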