Prerequisites
Before creating entities, ensure you are on the correct workspace and branch. Use get_session_workspace_and_branch to check the current session context. For development work, create a branch with create_workspace_branch (the session switches automatically). See the model-exploration skill for the full workspace/branch tool reference.
Overview
A Honeydew entity is the foundational modeling object — a named, governed representation of a business concept at a specific granularity. Every metric and calculated attribute is anchored to an entity. An entity maps to a data warehouse (Snowflake, Databricks, or BigQuery) table, view, custom SQL query, or a virtual derivation from the semantic model.
When creating an entity you are answering three questions:
- What is this? — the business concept (orders, customers, products, locations)
- What makes a row unique? — the granularity key (primary key column(s))
- What columns should be exposed? — the initial attribute mapping from source columns
This skill focuses on the entity shell: source, key, and attribute mapping. Use attribute-creation to add calculated attributes and relation-creation to wire up joins afterwards.
Creation Methods
Quick Import: import_tables
Use import_tables to quickly import one or more data warehouse tables into the semantic model.
Each table becomes an entity with its columns automatically mapped as attributes. This is the fastest way to bootstrap entities when you don't need custom YAML.
Parameters:
- tables: List of fully qualified table names in the format <database>.<schema>.<table>
Best for: bulk imports, quick prototyping, or when the default column-to-attribute mapping is sufficient.
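Because import_tables expects fully qualified names, it can help to screen the list before making the call. The following is a minimal sketch, not part of Honeydew: the helper name split_table_names and the sample table names are hypothetical, and the pattern assumes plain unquoted identifiers (quoted names or warehouse-specific characters such as hyphens are not handled).

```python
import re

# Assumption: each path part is a plain unquoted identifier.
# Quoted identifiers and warehouse-specific naming rules are out of scope.
_PART = r"[A-Za-z_][A-Za-z0-9_$]*"
_FQN = re.compile(rf"^{_PART}\.{_PART}\.{_PART}$")


def split_table_names(names):
    """Split names into (valid, invalid) by the <database>.<schema>.<table> shape."""
    valid, invalid = [], []
    for name in names:
        cleaned = name.strip()
        (valid if _FQN.match(cleaned) else invalid).append(cleaned)
    return valid, invalid


# Hypothetical table names, for illustration only.
valid, invalid = split_table_names(
    ["ANALYTICS.SALES.ORDERS", "SALES.ORDERS", "ANALYTICS.SALES.CUSTOMERS"]
)

# Only the fully qualified names go into the `tables` parameter.
import_tables_args = {"tables": valid}
```

Names that land in invalid (here, the two-part SALES.ORDERS) are better resolved with the user than guessed at.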
Primary: create_entity (Recommended)
Unlike metrics and attributes, entity creation always requires YAML — there is no simplified single-expression API. create_entity is the preferred MCP tool because it creates both the entity and its dataset in a single operation.
Parameters:
- entity_yaml: YAML defining the entity
- dataset_yaml: YAML defining the dataset
Required permission: Editor or higher.
Alternative: create_object (Generic)
Use create_object only when creating an entity or dataset independently (e.g. adding a second dataset to an existing entity). Requires a separate call per object.
Parameters:
- yaml_text: YAML defining the object
After Creation: Display the UI Link
After a successful create_entity or create_object call, the response includes a ui_url field. Always display this URL to the user so they can quickly open the object in the Honeydew application.
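As a rough illustration of this step, the sketch below pulls ui_url out of a tool response and formats it for display. It assumes the response has already been parsed into a Python dict; the function name format_ui_link, the message wording, and the example URL are hypothetical.

```python
def format_ui_link(response):
    """Return a one-line message with the UI link, or None if the field is absent."""
    url = response.get("ui_url")
    if not url:
        return None
    return f"Open in Honeydew: {url}"


# Hypothetical response shape; only the `ui_url` field name comes from this skill.
example_response = {"ui_url": "https://example.invalid/entities/orders"}
message = format_ui_link(example_response)
```

Returning None when the field is missing lets the caller decide whether to report the gap instead of printing an empty link.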
Decision Flow
Need to create an entity?
│
├─► Quick import from warehouse table(s)? Default mapping is fine?
│ └─► Use import_tables ✓ (fastest)
│
├─► New entity + dataset together (most common)?
│ └─► Use create_entity ✓ (full control)
│
├─► Adding a second dataset to existing entity?
│ └─► Use create_object with dataset YAML only
│
└─► Updating an existing entity or dataset?
  └─► Use update_object with YAML + object_key
      (preserve existing field order to keep the diff minimal)
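The same decision flow can be written as a small function. This is a sketch for illustration only: the function name choose_tool and its three boolean flags are hypothetical, while the returned tool names are the ones listed in this skill.

```python
def choose_tool(*, entity_exists, adding_dataset=False, default_mapping_ok=False):
    """Map the decision flow onto a tool name.

    entity_exists:      the entity is already in the model
    adding_dataset:     the goal is to attach another dataset to that entity
    default_mapping_ok: automatic column-to-attribute mapping is sufficient
    """
    if entity_exists:
        # Existing entity: either attach a new dataset or update in place.
        return "create_object" if adding_dataset else "update_object"
    if default_mapping_ok:
        # Fastest path: import straight from warehouse tables.
        return "import_tables"
    # Most common case: new entity and dataset together, with full control.
    return "create_entity"
```

For example, choose_tool(entity_exists=False) yields create_entity, the most common path.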
Examples
See examples.md for full worked examples covering: physical table, custom SQL, virtual entity, time spine, update, and delete.
Discovery Helpers
Use these MCP tools before creating entities:
- list_entities: List all entities in the model
- get_entity: Get detailed info for a specific entity (attributes, metrics, datasets, relations)
- search_model: Search for entities, fields, or other objects by name (use search_mode: EXACT for known names, OR for broad discovery)
- list_databases: List all databases in the connected data warehouse
- list_schemas: List schemas in a specific database
- list_tables: List warehouse tables (requires database and schema parameters)
- get_table_info: Get column details for a specific warehouse table
See reference.md for: YAML schemas (entity + dataset), source types, and granularity key rules.
Documentation Lookup
Use the honeydew-docs MCP tools to search the Honeydew documentation when:
- You need to understand entity source types (physical table, custom SQL, virtual entity) in more depth
- The user asks about granularity design, entity modeling patterns, or best practices
- You need guidance on advanced modeling configurations like time spines, composite keys, or multi-dataset entities
- The user asks about entity governance, ownership, or lifecycle
- The user needs advanced modeling patterns (e.g., slowly changing dimensions, snapshot entities, incremental models)
Search for topics like: "entities", "source types", "granularity", "time spine", "virtual entity", "custom SQL source".
Best Practices
- Always set owner to identify the responsible team or person for governance and accountability.
- One entity = one granularity. Never mix row-level grains. If a table has order-level and line-item-level rows, model them as two separate entities.
- Name entities after the business concept, not the table. orders is better than fact_orders or fct_orders_v2.
- Always expose foreign key columns as attributes. FK columns (e.g. customer_id on orders) are required for Honeydew to resolve relations.
- Set timegrain on every date/timestamp source column. Omitting it leads to unexpected aggregation behavior in BI tools.
- Prefer absolute table paths (DB.SCHEMA.TABLE) to avoid ambiguity across environments.
- Convention: entity = orders, dataset = orders_source. Keeping names distinct prevents confusion in YAML.
MANDATORY: Validate After Creating
After creating ANY entity, you MUST invoke the validation skill to test and validate.
See validation skill for:
- How to verify entity exists via list_entities
- How to verify data flows via get_data_from_fields
- Sanity checks (row count, key uniqueness)
- When to alert the user about issues
Quick validation:
- Verify entity exists using list_entities, filter for the new entity name.
- Verify data flows using get_data_from_fields with:
  metrics: ["<entity>.count"]
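The two quick-validation steps can be sketched as follows. This is an illustration under assumptions: the helper names are hypothetical, and the list_entities result is assumed to have been reduced to a plain list of entity names (the actual response shape may differ). Only the metrics parameter and the "<entity>.count" pattern come from this skill.

```python
def entity_is_listed(entity_name, listed_names):
    """Step 1: check that the new entity appears among the listed entity names."""
    return entity_name in set(listed_names)


def count_check_args(entity_name):
    """Step 2: build the get_data_from_fields arguments for a row-count check."""
    return {"metrics": [f"{entity_name}.count"]}


# Hypothetical entity names, for illustration only.
listed = ["customers", "orders", "products"]
found = entity_is_listed("orders", listed)
args = count_check_args("orders")
```

If found is False, the count check is not worth running; the creation call most likely failed or targeted a different workspace or branch.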
Common Pitfalls to Avoid
- Non-unique or nullable keys. Honeydew assumes keys are unique and non-null. Duplicate keys cause silently incorrect join results. Validate key uniqueness and non-nullness before modeling.
- Using custom SQL when a physical table would do. Custom SQL blocks filter pushdown. Use a domain-level filter instead and keep the entity on the physical table.
- Skipping the key column in the attribute list. The key column must be in the dataset attributes list or Honeydew cannot resolve it.
- Composite keys on virtual entities. Use HASH() to create a single synthetic key attribute instead.
- Forgetting is_time_spine on your date dimension. Time-aware metrics will not function without a designated time spine entity.
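For the first pitfall in this list (non-unique or nullable keys), a pre-modeling check might look like the sketch below. It uses an in-memory SQLite database as a stand-in for the warehouse so the example runs on its own; the function name key_problems, the orders table, and its rows are hypothetical. Against a real warehouse, the same two queries would be run through whatever query interface is available.

```python
import sqlite3


def key_problems(conn, table, key_columns):
    """Return (rows_with_null_key, duplicated_key_groups) for a candidate key.

    Identifiers are interpolated directly into the SQL, so pass only
    trusted table and column names.
    """
    cols = ", ".join(key_columns)
    null_pred = " OR ".join(f"{c} IS NULL" for c in key_columns)
    nulls = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {null_pred}"
    ).fetchone()[0]
    dups = conn.execute(
        f"SELECT COUNT(*) FROM "
        f"(SELECT {cols} FROM {table} GROUP BY {cols} HAVING COUNT(*) > 1)"
    ).fetchone()[0]
    return nulls, dups


# Stand-in data: one duplicated key (2) and one NULL key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (2, 5.0), (2, 7.5), (None, 3.0)],
)

null_rows, duplicate_groups = key_problems(conn, "orders", ["order_id"])
```

Both numbers should be zero before the column is used as a granularity key; anything else is worth raising with the user before creating the entity.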