query · General · Mesrai Skills

Prerequisites

Queries run against the workspace and branch set for the current session. Use get_session_workspace_and_branch to check the current context. If no workspace/branch is set, use list_workspaces, list_workspace_branches, and set_session_workspace_and_branch to select one. See the model-exploration skill for the full workspace/branch tool reference.

Overview

Honeydew provides three ways to query data through the semantic layer. Each method suits a different situation — pick the right one based on how well you understand the model and how complex the question is.

Method	Tool	Best For
Structured query	`get_data_from_fields` / `get_sql_from_fields`	You know the exact fields. Deterministic, full control.
Deep analysis	`ask_deep_analysis_question`	Any natural language question — simple or complex, "why", multi-step, agentic.

When to Use Each Method

1. Structured Query (`get_data_from_fields` / `get_sql_from_fields`)

Use when:

You know the entity, attribute, and metric names (or can discover them via list_entities / get_entity)
You need precise control over filters, ordering, and field selection
You want deterministic, reproducible results
You need to validate a newly created metric or attribute
The user specifies exact fields like "show me detailed_listings.price by detailed_listings.room_type"

Do NOT use when:

The question requires multi-step reasoning or investigation

How it works:

get_data_from_fields — executes the query and returns data rows
get_sql_from_fields — returns the generated SQL without executing (useful for review, debugging, or handing off to other tools)

Both take the same field parameters.

2. Deep Analysis (`ask_deep_analysis_question`)

Use when:

The user asks a question in plain English and you don't know the exact field names
The user wants a quick answer without worrying about model details
The question requires multiple steps or investigative reasoning
The user asks "why" something happened (e.g., "why did revenue drop in Q3?")
The user wants trend analysis, anomaly detection, or root cause investigation
The question is open-ended and may require looking at the data from multiple angles
Follow-up questions build on prior analysis (use conversation_id)

Decision Flow

User asks a data question
    │
    ├─► Do you know the exact field names?
    │       │
    │       ├─► YES → get_data_from_fields (structured, deterministic)
    │       │         (or get_sql_from_fields to preview SQL without executing)
    │       │
    │       └─► NO → ask_deep_analysis_question (plain English, any complexity)
    │
    └─► Plain English question / investigation / "why" / trends?
            └─► ask_deep_analysis_question

Method 1: Structured Query

Field Parameters

A structured query uses flat field parameters to define what data to retrieve:

attributes — dimensions to group by (columns in the output), e.g. ["entity.attribute_name"]
metrics — aggregated measures (SUM, COUNT, AVG, etc.), e.g. ["entity.metric_name"]
filters — row-level filters applied before aggregation, e.g. ["entity.field = 'value'"]
order_by — sort order for results. Each entry MUST be a quoted string, as if it were a SQL identifier — e.g. ["\"entity.field\" ASC"]. Always wrap the field reference in double quotes inside the string.
domain — optional domain name for query context
limit — max rows to return (default: 100)
offset — rows to skip (for pagination)

All fields use entity.field_name syntax. Cross-entity fields are supported when relations exist.

Discovering Fields

Before building a query, discover the available fields:

list_entities — see all entities
get_entity with entity name — see its attributes, metrics, and relations
get_field with entity and field name — get detailed info about a specific field
list_domains — see all available domains (useful before passing domain parameter)
search_model with a keyword and search_mode (OR for broad discovery, EXACT for known names) — find fields across the model

Examples

Simple metric query — total count:

Call get_data_from_fields with:

metrics: ["detailed_listings.count"]

Dimension breakdown — listings by room type:

Call get_data_from_fields with:

attributes: ["detailed_listings.room_type"]
metrics: ["detailed_listings.count"]
order_by: ["\"detailed_listings.count\" DESC"]

Filtered query — only entire homes:

Call get_data_from_fields with:

attributes: ["detailed_listings.neighbourhood_cleansed"]
metrics: ["detailed_listings.count"]
filters: ["detailed_listings.room_type = 'Entire home/apt'"]
order_by: ["\"detailed_listings.count\" DESC"]

Cross-entity query — listings with host info:

Call get_data_from_fields with:

attributes: ["detailed_listings.room_type", "dim_host.host_is_superhost"]
metrics: ["detailed_listings.count"]
order_by: ["\"detailed_listings.count\" DESC"]

Using aliases — rename fields or ad-hoc expressions:

You can alias any field or ad-hoc expression using AS "alias_name". This controls the column name in the output.

Call get_data_from_fields with:

attributes: ["detailed_listings.room_type"]
metrics: ["detailed_listings.count AS \"total_listings\"", "AVG(detailed_listings.price) AS \"avg_price\""]
order_by: ["\"total_listings\" DESC"]

Once aliased, use the alias (not the original expression) in order_by.

Pagination — large result sets:

Call get_data_from_fields with:

attributes: (your fields)
metrics: (your metrics)
limit: 50 (max rows to return)
offset: 100 (skip first 100 rows)

Finding duplicate values:

Call get_data_from_fields with:

attributes: ["detailed_listings.host_name"]
metrics: ["COUNT(detailed_listings.host_name)"]
filters: ["COUNT(detailed_listings.host_name) > 1"]
order_by: ["\"COUNT(detailed_listings.host_name)\" DESC"]

This groups by the attribute, counts occurrences, and filters to only rows that appear more than once — surfacing duplicates.

SQL preview only:

Call get_sql_from_fields with the same field parameters to see the generated SQL without executing.

Filter Syntax

Filters use standard comparison expressions: =, >, <, IN (...), ILIKE, SEARCH(...), IS NULL, booleans, date ranges, and AND/OR combinations.

For the complete filter expression reference — including SEARCH, date handling, and type casting — see the filtering skill.

Method 2: Deep Analysis

ask_deep_analysis_question

Call with:

question (required): the analysis question
agent (optional): agent name to use as analysis context — use list_agents to discover available agents and their associated domains
conversation_id (optional): ID from a previous deep analysis call, for follow-up questions

question: "Analyze the relationship between host response time and review scores. Are there significant patterns?"
agent: "my_agent"

Returns:

Markdown analysis report with findings
Supporting data
Suggested follow-up questions
conversation_id for continuing the conversation

After Deep Analysis: Display the UI Link

After a successful ask_deep_analysis_question call, the response includes a ui_url field. Always display this URL to the user so they can view the full analysis in the Honeydew application.

Follow-up Questions

Use conversation_id from the previous response to ask follow-up questions that build on the prior analysis:

question: "Now break this down by room type — does the pattern hold across all types?"
agent: "my_agent"
conversation_id: "<id from previous response>"

Example Questions

Simple natural language:

"What are the top 10 neighbourhoods by number of listings?"
"Show me average price by room type."

Complex analysis:

"Why did the average review score drop for listings in Brooklyn?"
"What factors most influence listing price? Analyze the key drivers."
"Compare superhost vs non-superhost performance across all metrics."
"Identify unusual patterns in listing availability over the past year."
"What are the characteristics of top-performing listings?"

Combining Methods

For complex tasks, combine methods in sequence:

Discover — Use list_entities / get_entity to understand the model
Query — Use get_data_from_fields for precise, targeted queries
Investigate — Use ask_deep_analysis_question for root cause or trend analysis

Example Workflow

User: "Help me understand pricing patterns for Airbnb listings."

Discover entities: list_entities → find detailed_listings
Explore fields: get_entity for detailed_listings → find price, room_type, neighbourhood_cleansed
Targeted query: get_data_from_fields → price distribution by neighbourhood for Entire homes only
Deep dive: ask_deep_analysis_question → "What factors most influence listing price? Analyze correlations with room type, location, amenities, and reviews."

Documentation Lookup

Use the honeydew-docs MCP tools to search the Honeydew documentation when:

The user asks about query capabilities or features not covered in this skill
You need to understand how the query API interacts with domains, parameters, or governance rules
The user encounters unexpected query behavior and needs deeper context on how the semantic layer resolves queries

Search for topics like: "queries", "perspectives", "dynamic datasets", "parameters", "query API".

Tip: Getting Distinct Values for a Field

To retrieve the distinct (unique) values of a field, include it in attributes and add a count metric in metrics. Use the entity's built-in count metric (e.g., entity.count) if available, or an ad-hoc count metric using COUNT(entity.field) on the field whose distinct values you want — never use COUNT(*). The metric forces aggregation, which groups by the attribute and returns one row per distinct value.

Example — distinct room types:

Call get_data_from_fields with:

attributes: ["detailed_listings.room_type"]
metrics: ["COUNT(detailed_listings.room_type)"]
order_by: ["\"COUNT(detailed_listings.room_type)\" DESC"]

This returns each unique room_type along with its count, ordered by frequency. The count is a useful bonus — it tells you how common each value is — but the key point is that the query returns one row per distinct value.

This pattern is useful for:

Exploring filter values — find out what values exist before writing a filter expression (see the filtering skill)
Validating a new attribute — after creating a calculated attribute, check its distinct output values to confirm the logic is correct (see the attribute-creation skill)
Understanding data distribution — see how data is spread across categories

Best Practices

Start with discovery — always check list_entities / get_entity before building queries, so you reference real fields
Use structured queries for precision — when you know the fields, get_data_from_fields gives you full control and reproducible results
Use deep analysis for insight — when the question is about "why" or requires investigating multiple dimensions
Paginate large results — use limit and offset in get_data_from_fields to avoid overwhelming output
Show SQL when debugging — use get_sql_from_fields to inspect the generated query
Reference fields correctly — always use entity.field_name syntax in field parameters

Prerequisites

Overview

When to Use Each Method

1. Structured Query (get_data_from_fields / get_sql_from_fields)

2. Deep Analysis (ask_deep_analysis_question)

Decision Flow

Method 1: Structured Query

Field Parameters

Discovering Fields

Examples

Filter Syntax

Method 2: Deep Analysis

ask_deep_analysis_question

After Deep Analysis: Display the UI Link

Follow-up Questions

Example Questions

Combining Methods

Example Workflow

Documentation Lookup

Tip: Getting Distinct Values for a Field

Best Practices

1. Structured Query (`get_data_from_fields` / `get_sql_from_fields`)

2. Deep Analysis (`ask_deep_analysis_question`)