Skip to content

Product Schema

Entity Type: Product
Purpose: Represents a product with associated items, resources, and relationships
Format: JSONL (one product per line)
Sample Files: ecommerce_629037_20260413221143_001.jsonl, gemaire/ecommerce_20260415122159450_001.jsonl

Structure

Active Product

json
{
  "EntityId": number,
  "EntityTypeId": "Product",
  "FieldSetId": null,
  "Attributes": { ... },
  "Items": [ ... ],
  "Resources": [ ... ],
  "InboundLinks": [ ... ],
  "OutboundLinks": [ ... ]
}

Deleted Product

json
{
  "EntityId": number,
  "EntityTypeId": "Product",
  "Deleted": true
}

When Deleted: true, all other fields (Attributes, Items, Resources, InboundLinks, OutboundLinks) are null/empty. ETL should remove these products from the target system.

Top-Level Properties

  • EntityId (number) - Unique product identifier
  • EntityTypeId (string) - Always "Product"
  • FieldSetId (null) - Optional fieldset reference
  • Deleted (boolean) - Present and true for deletion records; absent for active products
  • Attributes (object) - Product-level attributes
  • Items (array) - Associated Item entities (SKUs)
  • Resources (array) - Product-level media/documents
  • InboundLinks (array) - Entities linking to this product
  • OutboundLinks (array) - Entities this product links to

Product-Level Attributes

Products have a small set of their own attributes (6-9 populated):

  • ProductId (string) - Product identifier (e.g., "P100982724")
  • ProductShortDescription (localized object) - {"en": "...", "es": "...", "fr-CA": "..."}
  • ProductProductFamily (string) - Product family code (e.g., "PML320")
  • ProductMainSupplier (string) - Primary supplier identifier
  • ProductReportingTaxonomyLevel3 (string) - Reporting taxonomy code
  • ProductWatscoTaxonomy1 (string) - Watsco structure code level 1
  • ProductWatscoTaxonomy2 (string) - Watsco structure code level 2
  • ProductWatscoTaxonomy3 (string) - Watsco structure code level 3

Items Array

Each product contains zero or more Item entities representing SKUs.

  • Single-item products: ~95% of catalog
  • Multi-item products: up to 59 items per product (configurable products with variants)
  • Zero items: only on Deleted: true records
json
{
  "EntityId": number,
  "EntityTypeId": "Item",
  "FieldSetId": null,
  "Attributes": { ... },
  "Resources": [ ... ],
  "InboundLinks": [ ... ],
  "OutboundLinks": [ ... ]
}

Key Item Attributes

Identification:

  • ItemLegacyPIMItemNo - Legacy PIM identifier
  • ItemManufacturerItemNo - Manufacturer part number
  • ItemWatscoItemNo - Watsco internal item number
  • ItemUPCCode - UPC barcode
  • ItemEANCode - EAN barcode
  • ItemGTINCode - GTIN code

Supplier Information:

  • ItemMainSupplier - Primary supplier name
  • ItemManufacturer - Manufacturer name
  • ItemSupplierShortDescriptionEN - Short description
  • ItemSupplierLongDescriptionEN - Long description

Physical Attributes:

  • ItemItemImperialHeightinInches
  • ItemItemImperialLengthinInches
  • ItemItemImperialWidthinInches
  • ItemItemImperialWeightinPounds
  • ItemItemImperialCubicMeasurement
  • ItemItemImperialDimensionalWeight

Packaging:

  • ItemPackagedImperialHeightinInches
  • ItemPackagedImperialLengthinInches
  • ItemPackagedImperialWidthinInches
  • ItemPackagedImperialWeightinPounds
  • ItemItemQuantity
  • ItemItemQuantityType (e.g., "EA")
  • ItemItemPackagedQuantity
  • ItemItemMastCartonQuantity

Inventory & Status:

  • ItemOnHandQtyInAllWatscoSubs - Total on-hand quantity
  • ItemSupplierDiscontinuedFlag - Discontinuation status
  • ItemCreatedOn - Creation timestamp
  • ItemLastChangedon - Last modified timestamp

Business Flags:

  • ItemHydrosItem - Hydros system flag
  • ItemSupplySyncItem - Supply sync flag
  • ItemRebateItem - Rebate eligibility
  • ItemRebateCategory - Rebate category
  • ItemOEMPart - OEM part flag

Origin:

  • ItemCountryofOrigin - Array of country codes (e.g., ["USA"])

Resources

Both Product and Item entities can have associated resources:

json
{
  "ResourceURL": "https://watsco-pim.s3.us-east-1.amazonaws.com/InRiver/SANDBOX/ECOM/assets/...",
  "Attributes": {
    "ResourceFileId": number,
    "ResourceFilename": "string",
    "ResourceTitle": "string",
    "ResourcePublishedStatus": "NotPublished",
    "ResourceResourceType": "normal|ai1|ai2|ai3|ai4|logo|ibw",
    "ResourceImportProcessed": boolean,
    "ResourceMimeType": "jpeg|png|pdf",
    "ResourceContentHash": "SHA-256 hash string"
  }
}

Resource Types

  • normal - Standard product image (most common - 89%)
  • ai1-ai4 - Additional images (11%)
  • logo - Brand/manufacturer logo
  • ibw - Unknown type
  • cl - Cutsheet/catalog (legacy format only)
  • subs - Submittal sheet (legacy format only)

CDN Migration

Resources have moved from Aperture CDN to Watsco S3:

  • Legacy: https://asset.productmarketingcloud.com/api/assetstorage/...
  • Current: https://watsco-pim.s3.us-east-1.amazonaws.com/InRiver/SANDBOX/ECOM/assets/...

Resource Coverage

  • ~81% of products have at least one resource
  • Products range from 0 to 76 resources
  • All current resources are images (jpeg/png); no PDFs in JSONL samples

Category assignments via ChannelNodeProduct link type. All active products have at least one.

json
{
  "LinkTypeId": "ChannelNodeProduct",
  "SourceEntityTypeId": "ChannelNode",
  "SourceEntityId": number
}

Always empty in observed samples.

Items have their own InboundLinks and OutboundLinks for product relationships:

json
{
  "LinkTypeId": "ItemAssociatedItem|ItemSubstituteItem|ItemUpsellingItem",
  "TargetEntityTypeId": "Item",
  "TargetEntityId": number
}

Link types observed:

  • ItemAssociatedItem - Related/associated products (most common)
  • ItemSubstituteItem - Substitute/replacement products
  • ItemUpsellingItem - Upselling recommendations

JSONL File Naming Conventions

Two naming patterns observed:

  1. With channel ID: ecommerce_{channelId}_{YYYYMMDDHHmmss}_{seq}.jsonl

    • Example: ecommerce_629037_20260413221143_001.jsonl
  2. Without channel ID: ecommerce_{YYYYMMDDHHMMSSmmm}_{seq}.jsonl

    • Example: ecommerce_20260415122159450_001.jsonl (Gemaire BU)

Sequence numbers (_001, _002) indicate file chunking for large exports.

Example Hierarchy

Product 648070 (Active)
├── Attributes: ProductId, ProductShortDescription, ProductFamily, Taxonomy
├── InboundLinks: ChannelNodeProduct → ChannelNode 50714
├── Item 432829 - "PML320 4\" Brown Louvered Hood, Ring"
│   ├── OutboundLinks: ItemAssociatedItem → Item 431717
│   └── Resources: [normal.jpg]
├── Item 432838 - "PML320 4\" White Louvered Hood, 11\" Tail"
├── Item 432845 - "PML320 4\" White Louvered Hood, Ring"
├── Item 432840 - "PML320 4\" Brown Louvered Hood, 11\" Tail"
└── Item 442896 - "PML320 6\" White Louvered Hood, Ring"

Product 647401 (Deleted)
└── {"EntityTypeId": "Product", "EntityId": 647401, "Deleted": true}

Usage

Products are the primary entity for e-commerce integration:

  • Contains all SKU-level data via Items array
  • Links to media assets via Resources
  • Connects to categories via ChannelNode relationships
  • Provides complete product information for display and ordering