Mandatory Dataset
The following core dataset files are mandatory for building the Yieldigo instance and must be provided in the structure described below. Each table indicates the column name in the file, the data type, and a description. Primary keys (unique values) are highlighted in bold.

mandatory dataset.zip: https://d3jwyy9rhyhl55.cloudfront.net/v2/4158/contents/y79vanunzq7ogkyj/mandatory dataset.zip

Categories

To enable accurate product categorization, a complete list of all product categories must be provided, including both the category names and their hierarchical position within the category tree.

Only one category structure can be implemented: either a tree or a horizontal structure. If multiple category structures exist on the client’s side, the client must designate one primary structure to be used for product classification.

[Table: categories columns]

categories.csv: https://d3jwyy9rhyhl55.cloudfront.net/v2/56396/contents/ftcbefasyxbne3tc/categories.csv

Data checks
- All rows with an empty parent id represent top-level categories.
- Categories must form a tree-like structure, i.e. an acyclic graph.
- A category tree typically consists of 3–6 levels.
- The category hierarchy must be unambiguous: if multiple structures exist, a single primary structure must be chosen.
- Each category must contain at least some products.

Other options for export

An alternative approach to the export of categories is the horizontal structure.

horizontal categories.csv: https://d3jwyy9rhyhl55.cloudfront.net/v2/4158/contents/hik5r0f0zdpxmlcj/horizontal categories.csv

Products

A complete list of all products included in the analysis must be provided. This dataset should contain all essential information for each product, including the product name,
description, and any other relevant metadata.

Multipacks

If each multipack's pricing is independent of its products, then each multipack should have its own unique identifier. However, if the pricing of a multipack depends on the products it contains, then multipacks should not be included in the product table.

[Table: products columns]

Optional columns

[Table: products optional columns]

products.csv: https://d3jwyy9rhyhl55.cloudfront.net/v2/4158/contents/gmhytva0d9gzhvhu/products.numbers

Data checks
- Each leaf category id must reference a leaf node, i.e. the last and most detailed level of the category tree.
- Active products are defined as items that have been sold in the recent past or are newly listed.
- Unit size represents the expected measurement unit within a given category. For example, do not mix grams and kilograms. In case of discrepancies, all values should be converted to kilograms.
- The prices file should contain entries for the majority of active products.

Other options for export

Products, brands, and suppliers may be combined into a single table, but only if each product has exactly one brand and one supplier assigned. If a product can be associated with multiple brands or suppliers, this information should be provided in separate files (see sections Brands and Suppliers).

Possible modifications
- status: a binary field (true/false) can be used.
- custom data: a column for any additional
information the client would like to display on the reporting page.

product brands suppliers.csv: https://d3jwyy9rhyhl55.cloudfront.net/v2/4158/contents/j4eqprl2ldkcnync/product brands suppliers 2.numbers

Stores

This table is mandatory and may only be omitted if a single price applies to the entire market.

[Table: stores columns]

Optional columns

[Table: stores optional columns]

stores.csv: https://d3jwyy9rhyhl55.cloudfront.net/v2/4158/contents/osbs4gpnbhkq8wwt/stores.csv

Zones

A pricing zone is defined as a group of stores that share the same regular sale price. This table is mandatory and may only be omitted if there is a single pricing zone, i.e. all stores use the same price list.

[Table: zones columns]

All stores within a single zone must share the same price (or be intended to share it in the future). Each existing store must be assigned to exactly one zone.

zones.csv: https://d3jwyy9rhyhl55.cloudfront.net/v2/56396/contents/0tmvlvxpfdbecp8r/zones.csv

Data checks
- Store id must not contain any whitespace characters.
- Each store must be assigned to a zone.

Other options for export
- Export zone id and zone name as additional columns in the stores file.
  stores zones.csv: https://d3jwyy9rhyhl55.cloudfront.net/v2/4158/contents/q5upvuasafx0spo7/stores zones.csv
- Export zone id as a new column in the stores file. The zones file must contain both zone id and zone name.
  stores zones limited.csv: https://d3jwyy9rhyhl55.cloudfront.net/v2/4158/contents/uaw5yens2dt7lywv/stores zones limited.csv
  zones limited.csv: https://d3jwyy9rhyhl55.cloudfront.net/v2/56396/contents/ls8ysxxyxjv5jnow/zones limited.csv

Actual Prices

Provide the current valid regular prices for each store and product.

- Do you maintain multiple “layers” of price lists? For example, can promotional and regular prices be valid at the same time? Please provide only regular prices in the data you send.
- What is your primary data source for margin calculation? Is it the purchase price, the cost price, or the stock price? It is important to keep this choice consistent over time.
- How is the cost price calculated? Do you apply retroactive adjustments (e.g. back bonuses)?
- Can promotional and regular purchase prices be distinguished? If yes, please compute the cost price using only the regular purchase prices.
- All sale prices must exclude promotional prices.

[Table: prices columns]

prices.csv: https://d3jwyy9rhyhl55.cloudfront.net/v2/4158/contents/ccuiuudl8uwzockb/prices.csv

Data checks
- The majority of active products must have an entry in the prices file.
- All values in the zone id column must exist in the zones file.
- Prices must not include promotional prices.
- The majority of products should have a positive margin.

Other options for export

Export is promo price as a new boolean column in the prices file. This column enables the export of actual promotional prices. It should be used in cases where the regular price is not available during promotions, and the cost price may vary significantly.

prices with promo flag.csv: https://d3jwyy9rhyhl55.cloudfront.net/v2/4158/contents/6njcyn6mmhiguvry/prices with promo flag.csv

Sales

The sales files must contain all sales transactions. Ideally, they should cover three years of sales history, including daily increments for the most recent days.

For the first iteration, please provide two months of sales data only. This smaller dataset will be sufficient to verify the format and identify potential data issues.

For sales data, we distinguish between two types:
- Historical data: exported as monthly files in the format sales yyyymm.csv (e.g., sales 201901.csv, sales 201902.csv).
- Incremental data (daily/weekly): exported as files in the format sales yyyymm.csv (monthly) or sales yyyymmdd.csv (daily / starting day of the week).

Points to clarify and rules to follow:
- Deletion of records: clarify whether sales records can disappear (e.g., visible one day, missing the next). Consistency of records is required.
- Full-day data: the system requires complete sales data for each day. It is preferable to provide data that is two days old but complete, rather than incomplete data from the previous day.
- State of orders (e-shops): define which orders should be treated as final sales, ensuring they are stable and unlikely to be canceled.
- Vouchers, discounts, and promotions: specify which activities affect the regular sale price and how these can be identified in the data.
- Returns: returns must not be included in the dataset.

[Table: sales columns]

Optional columns

[Table: sales optional columns]
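The sales export rules (monthly files, complete days only, no returns) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of the Yieldigo tooling: the row keys `date` and `is_return`, the function names, and the underscore in the `sales_YYYYMM.csv` file name are all hypothetical, and the two-day lag is only the preference mentioned in the text, exposed here as a parameter.

```python
from collections import defaultdict
from datetime import date, timedelta


def latest_complete_day(today: date, lag_days: int = 2) -> date:
    """Newest day treated as complete; the 2-day lag is an assumed default."""
    return today - timedelta(days=lag_days)


def partition_sales(rows, today: date, lag_days: int = 2):
    """Group sales rows into monthly file names, dropping returns and incomplete days.

    Each row is a dict with the hypothetical keys "date" (datetime.date) and
    "is_return" (bool); a real export will use its own column names.
    """
    cutoff = latest_complete_day(today, lag_days)
    files = defaultdict(list)
    for row in rows:
        if row.get("is_return"):
            continue  # returns must not be included in the dataset
        if row["date"] > cutoff:
            continue  # the day may still be incomplete, so hold it back
        # Assumed file name pattern; the separator is not confirmed by the text.
        files[f"sales_{row['date']:%Y%m}.csv"].append(row)
    return dict(files)


rows = [
    {"date": date(2019, 1, 30), "is_return": False},
    {"date": date(2019, 2, 1), "is_return": False},
    {"date": date(2019, 2, 3), "is_return": True},   # dropped: return
    {"date": date(2019, 2, 9), "is_return": False},  # dropped: after the cutoff
]
monthly = partition_sales(rows, today=date(2019, 2, 10))
print(sorted(monthly))  # ['sales_201901.csv', 'sales_201902.csv']
```

Holding back recent days trades freshness for completeness, which matches the stated preference for complete data that is two days old over partial data from the previous day.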
sales.csv: https://d3jwyy9rhyhl55.cloudfront.net/v2/4158/contents/jk4lybcka5kly71g/sales

Data checks
- Price and shelf price must always include VAT (if VAT is applied).
- Cost price must always exclude VAT.
- No duplicate records are allowed in the file.
- Promo type id should contain only values from a predefined, limited set of options.
- Zero values in the quantity, price, or cost price columns must be checked for validity.
- Transactions with significantly negative margins must be reviewed; they are typically caused by promotions or data errors.

Other options for export

Export aggregated by day/store/product/price type/promo type. Within a single day, multiple rows may exist: one for regular sales and additional rows for each price type id or promo type id.

⚠️ This option is not recommended, as it restricts the availability of certain product features.

sales aggregated.csv: https://d3jwyy9rhyhl55.cloudfront.net/v2/4158/contents/lh1hnczpbgoal53a/sales aggregated.csv

Cross Check

Please provide aggregated transaction data once the sales file structure has been confirmed and valid transaction history has been delivered. This is a one-time export table
used to verify data quality after the sales history is provided. If inconsistencies are found between this manually exported data and the data exported automatically, further investigation will be required to identify and resolve the discrepancies.

The purpose of this table is to validate automatically exported data and cross-check sold quantities, revenues, and margins.
- Aggregation must be performed at the top category level.
- At least 12 months of history must be provided.

[Table: cross check columns]

cross check.csv: https://d3jwyy9rhyhl55.cloudfront.net/v2/4158/contents/zfadoepvigbuvovf/cross check.csv

Data checks
- Total quantities, revenue, and profit should correspond to expectations.
- Total quantities contain either the total number of products or weighted goods that are not counted in [...]
- Promo share should correspond to expectations.
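Several of the structural data checks in this document (acyclic category tree, leaf categories, store ids without whitespace, every store assigned to a known zone, price zone ids present in the zones file) can be run automatically before delivery. The following is a minimal Python sketch, assuming the files have already been parsed into plain dictionaries and lists. All function names and the sample ids are hypothetical and are not part of the Yieldigo tooling.

```python
def category_problems(parent_of):
    """Report unknown parents and cycles in a category table.

    parent_of maps each category id to its parent id; an empty string or
    None marks a top-level category.
    """
    problems = []
    for category in parent_of:
        seen = set()
        node = category
        while node:  # walk upwards until a top-level category is reached
            if node in seen:
                problems.append(f"{category}: cycle detected")
                break
            if node not in parent_of:
                problems.append(f"{category}: unknown parent {node}")
                break
            seen.add(node)
            node = parent_of[node]
    return problems


def leaf_categories(parent_of):
    """Categories that are not the parent of any other category."""
    parents = {p for p in parent_of.values() if p}
    return {c for c in parent_of if c not in parents}


def store_zone_problems(zone_of_store, zone_ids):
    """Check store ids for whitespace and each store for a known zone."""
    problems = []
    for store_id, zone_id in zone_of_store.items():
        if any(ch.isspace() for ch in store_id):
            problems.append(f"{store_id}: store id contains whitespace")
        if not zone_id:
            problems.append(f"{store_id}: no zone assigned")
        elif zone_id not in zone_ids:
            problems.append(f"{store_id}: unknown zone {zone_id}")
    return problems


def unknown_price_zones(price_zone_ids, zone_ids):
    """Zone ids used in the prices file but missing from the zones file."""
    return sorted(set(price_zone_ids) - set(zone_ids))


# Hypothetical sample data for illustration only.
categories = {"food": "", "dairy": "food", "milk": "dairy"}
broken = {"loop_a": "loop_b", "loop_b": "loop_a", "orphan": "missing"}
stores = {"S001": "Z1", "S 002": "Z1", "S003": "", "S004": "Z9"}
zones = {"Z1", "Z2"}

print(category_problems(categories))       # []
print(category_problems(broken))
print(leaf_categories(categories))         # {'milk'}
print(store_zone_problems(stores, zones))
print(unknown_price_zones(["Z1", "Z3", "Z3"], zones))  # ['Z3']
```

Keeping each store in a dictionary keyed by store id also enforces the "exactly one zone" rule by construction; an export that lists a store twice would need a separate duplicate check while the file is being read.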
