SheetMagicSheetMagic
Function Reference

Web Scraping Functions

Extract content from webpages with VISIT, GETSELECTOR, and GETIMG functions

Overview

SheetMagic provides powerful web scraping functions to extract content from any webpage. These functions handle JavaScript-rendered pages and include intelligent caching for performance.

VISIT

Extract content from any webpage for research and analysis. Returns clean Markdown by default, or raw HTML if specified. Use a CSS selector to extract specific elements from HTML.

Syntax

=VISIT(url, [format], [selector])

Parameters

ParameterTypeRequiredDescription
urlstringYesWebsite URL to extract content from
formatstringNoOutput format: "markdown" (default) or "html" for raw HTML. Case-insensitive.
selectorstringNoCSS selector to extract specific elements. Only works with "html" format.

Returns

Content from the webpage (up to 50,000 characters) in the specified format:

  • Markdown (default): Clean, readable text with Markdown formatting
  • HTML: Raw HTML content from the page
  • HTML with selector: Inner HTML of matching elements (multiple matches separated by newlines)

Examples

Basic usage (returns Markdown):

=VISIT("https://example.com/article")

Get raw HTML:

=VISIT("https://example.com/article", "html")

Extract specific elements with CSS selector:

=VISIT("https://example.com", "html", "div.content > p.intro")

Extract all prices from a page:

=VISIT(A1, "html", ".price")

Extract product titles:

=VISIT(A1, "html", "h2.product-title")

From cell reference:

=VISIT(A1)

From cell reference with HTML output:

=VISIT(A1, "HTML")

Features

  • Handles JavaScript-rendered pages
  • 30-minute user-specific caching
  • Flexible output format (Markdown or HTML)
  • CSS selector support for extracting specific elements
  • Removes navigation, ads, and boilerplate (in Markdown mode)

CSS Selector Examples

SelectorDescription
.classnameElements with specific class
#idnameElement with specific ID
div.productDiv elements with product class
div.content > pDirect child <p> elements of div.content
[data-price]Elements with specific attribute

The selector parameter only works with "html" format. If used with "markdown", the selector is ignored.

Use Cases

  • Content research
  • Competitor analysis
  • Article extraction
  • Documentation scraping
  • Data collection
  • HTML parsing and analysis
  • Price monitoring
  • Targeted element extraction

GETSELECTOR

Get specific elements from a page using a CSS selector.

Syntax

=GETSELECTOR(url, selector)

Parameters

ParameterTypeRequiredDescription
urlstringYesURL to scrape
selectorstringYesCSS selector to match elements

Returns

An array of text content from all matching elements.

Supported Selectors

Selector TypeExampleDescription
Class.classnameElements with specific class
ID#idnameElement with specific ID
Elementdiv, span, pAll elements of that type
Attribute[data-price]Elements with specific attribute
Combineddiv.productDiv elements with product class
Nesteddiv.product h2H2 inside product divs

Examples

Get all prices:

=GETSELECTOR("https://example.com/products", ".price")

Get product titles:

=GETSELECTOR(A1, "h2.product-title")

Get elements by attribute:

=GETSELECTOR(A1, "[data-product-id]")

Get nested elements:

=GETSELECTOR(A1, ".product-card .description")

Use Cases

  • Price monitoring
  • Product data extraction
  • Review scraping
  • Contact information extraction
  • Structured data collection

GETIMG

Extract all images from a webpage.

Syntax

=GETIMG(url)

Parameters

ParameterTypeRequiredDescription
urlstringYesURL to scrape images from

Returns

An array of image URLs (capped at 1,000 images).

Features

  • Handles responsive images (srcset, data-srcset)
  • Lazy loading support (data-src)
  • Automatic relative URL resolution
  • Selects largest image from srcset

Examples

Get all images from a page:

=GETIMG("https://example.com/gallery")

From cell reference:

=GETIMG(A1)

Use Cases

  • Image collection
  • Asset auditing
  • Visual content analysis
  • Media library building

Caching Behavior

All web scraping functions include intelligent caching to improve performance and reduce redundant requests.

Cache TypeDurationDescription
User Cache30 minutesResults cached per user
Script Cache10 minutesShared across all users

To force a fresh fetch, append a unique query parameter to the URL:

=VISIT(A1 & "?refresh=" & NOW())

Best Practices

1. Use Specific Selectors

// Less effective - too broad
=GETSELECTOR(A1, "div")

// More effective - specific class
=GETSELECTOR(A1, "div.product-price")

2. Handle Multiple Results

GETSELECTOR returns an array. Use with other Sheets functions:

// Get first result
=INDEX(GETSELECTOR(A1, ".price"), 1)

// Count results
=COUNTA(GETSELECTOR(A1, ".price"))

3. Combine with AI Functions

// Summarize scraped content
=AITEXT("Summarize: " & VISIT(A1))

// Analyze extracted data
=AITEXT("Analyze these prices: " & JOIN(", ", GETSELECTOR(A1, ".price")))

4. Respect Rate Limits

Avoid scraping too many URLs simultaneously. Space out requests to prevent blocking.


Error Handling

Common errors and solutions:

ErrorCauseSolution
#ERROR!URL not accessibleVerify URL is valid and publicly accessible
#BLOCKED!Site blocking requestsSome sites block automated access
#TIMEOUT!Request timed outTry again later or check URL
#EMPTY!No content foundVerify selector matches elements

Limitations

  • JavaScript-heavy sites: Some dynamic content may not be captured
  • Authentication: Cannot access pages requiring login
  • Rate limiting: Sites may block excessive requests
  • Size limits: Content truncated at 50,000 characters
  • Image limits: Maximum 1,000 images returned