@mattab opened this issue on November 7th 2014

The goal of this issue is to implement a new feature in Piwik that will let you import a custom data file (eg. in CSV format) containing extra information about your users, your content or your products. When importing this data, you tell Piwik how to match it to your people and content. Piwik will then automatically let you use the Custom Data in Segments, display it in the UI (eg. Visitor Log), and more!

This feature is similar to the Google Analytics Custom Data Import feature (which used to be called Dimension Widening): https://support.google.com/analytics/answer/3191417

Types of data you can import:

1. User data (log_visit): import user metadata, such as a loyalty rating or lifetime customer value, and use these values with segmentation.
2. Content data (log_action): group content by importing content metadata, such as author, date published, and article category.
3. Product Data (log_conversion_item): gain better merchandising insights by importing product metadata, such as size, color, style, or other product-related dimensions.
4. Campaign Data: use campaign tracking IDs and import ad campaign-related dimensions, such as source, medium.

Inspired by GA, because their approach just makes sense.


I prepare a data file in CSV format.

- The file contains a dimension known to Piwik (eg. userId), followed by up to N custom data columns:
  - The first row of the file contains the column names.
  - One of the columns must be the column known to Piwik, eg. userId.
  - The Piwik dimension must be one of the dimensions found in the Segmentation reference.
    - For example, when using a userId column in the CSV file, its values must match the User ID values set in Piwik.

Here is a file example:

userId, LifeTimeValue, Industry
clientid1, 500, IT
clientid2, 5, NGO
clientid5, 100, Farming
client765, 222, IT
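
A minimal sketch of how a file like this could be read, assuming Python's standard `csv` module and assuming that the blank after each comma should be ignored. The function name `read_custom_data` is hypothetical and not part of Piwik.

```python
import csv
import io


def read_custom_data(text):
    """Parse CSV text into (column_names, rows); the first row holds the column names."""
    # skipinitialspace drops the blank that follows each comma in the example file.
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    # Skip empty lines and trim any remaining whitespace around each cell.
    rows = [[cell.strip() for cell in row] for row in reader if row]
    if not rows:
        raise ValueError("the file is empty")
    return rows[0], rows[1:]


EXAMPLE = """userId, LifeTimeValue, Industry
clientid1, 500, IT
clientid2, 5, NGO
clientid5, 100, Farming
client765, 222, IT
"""

columns, rows = read_custom_data(EXAMPLE)
print(columns)   # ['userId', 'LifeTimeValue', 'Industry']
print(rows[-1])  # ['client765', '222', 'IT']
```

All values are kept as strings here; whether a column such as LifeTimeValue should be treated as numeric is left open.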

Then as a Piwik Admin user or Super User:

- I go to Settings > Custom Data
- I select a website to import the custom data for
- I select User as "Type of data to import" (I could instead import Content, Product or Campaign custom data)
- I give a name to my custom import, eg. "Customer LTV & Industry"
- I select the file and upload it (eg. to tmp/customdata)
- After the upload, Piwik detects the columns
  - If no known Piwik dimension is found in the columns, display an error message saying that a Piwik dimension is required
  - Detect the Custom Data dimensions from the file
  - Ask the user whether the custom data dimension(s) should be imported
  - Schedule the import as a background task
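
The column detection step could look roughly like the sketch below. It assumes the set of known dimensions comes from the Segmentation reference; the three names listed are only an illustrative subset, and `detect_dimensions` is a hypothetical helper. When several known dimensions are present, it takes the first one as the matching key, which is an assumption rather than something decided in this issue.

```python
# Illustrative subset only; the real list would come from the Segmentation reference.
KNOWN_PIWIK_DIMENSIONS = {"userId", "visitorId", "pageUrl"}


def detect_dimensions(columns, known=KNOWN_PIWIK_DIMENSIONS):
    """Split file columns into (piwik_dimension, custom_dimensions).

    Raises ValueError when no column is a known Piwik dimension.
    """
    matches = [name for name in columns if name in known]
    if not matches:
        raise ValueError("a Piwik dimension is required in the file columns")
    key = matches[0]  # assumption: the first known dimension is the matching key
    custom = [name for name in columns if name != key]
    return key, custom


print(detect_dimensions(["userId", "LifeTimeValue", "Industry"]))
# ('userId', ['LifeTimeValue', 'Industry'])
```

The returned custom dimensions are what the user would then be asked to confirm before the import is scheduled.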

Proposed steps / TBD

DB Schema:

- new table piwik_custom_data keeps track of the custom data as a simple lookup table
  - idimport, piwik_dimension_value, custom_dimension_value
  - eg. 1, client765, 222
  - storing the idimport lets us easily clean up a given Custom Data Import
- new table piwik_custom_data_dimensions
  - idimport, idsite, piwik_dimension_name, custom_dimension_name
  - eg. 1, 5, userId, LifeTimeValue
  - if the imported file contains several custom dimensions, then several rows are created in the table, one for each dimension
- new table piwik_custom_data_import
  - idimport, idsite, ts_imported, name, rows, login, status, deleted, filename
  - example entry for our test custom data import of 9 million rows:
    - 1, 2014-12-01 00:11:22, "Customer LTV & Industry", 9000000, matt, "pending", 0
  - Status could be: pending import, importing, imported
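
To make the proposed tables concrete, here is a rough sketch using Python's built-in `sqlite3` with an in-memory database. The column types are assumptions (the proposal only names the columns), Piwik itself uses MySQL, and the helpers `count_rows` and `delete_import` are hypothetical. The idsite value 5 is taken from the dimensions example, and filename is left empty because the example entry does not give one.

```python
import sqlite3

TABLES = ("piwik_custom_data", "piwik_custom_data_dimensions", "piwik_custom_data_import")

# Column names follow the proposal; the types are assumptions.
SCHEMA = """
CREATE TABLE piwik_custom_data (
    idimport INTEGER,
    piwik_dimension_value TEXT,
    custom_dimension_value TEXT
);
CREATE TABLE piwik_custom_data_dimensions (
    idimport INTEGER,
    idsite INTEGER,
    piwik_dimension_name TEXT,
    custom_dimension_name TEXT
);
CREATE TABLE piwik_custom_data_import (
    idimport INTEGER PRIMARY KEY,
    idsite INTEGER,
    ts_imported TEXT,
    name TEXT,
    "rows" INTEGER,
    login TEXT,
    status TEXT,
    deleted INTEGER,
    filename TEXT
);
"""


def count_rows(conn, idimport):
    """Return the number of rows stored for one import, per table."""
    return [
        conn.execute(f"SELECT COUNT(*) FROM {table} WHERE idimport = ?", (idimport,)).fetchone()[0]
        for table in TABLES
    ]


def delete_import(conn, idimport):
    """Clean up one Custom Data Import by removing every row tagged with its idimport."""
    for table in TABLES:
        conn.execute(f"DELETE FROM {table} WHERE idimport = ?", (idimport,))
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO piwik_custom_data VALUES (1, 'client765', '222')")
conn.executemany(
    "INSERT INTO piwik_custom_data_dimensions VALUES (?, ?, ?, ?)",
    [(1, 5, "userId", "LifeTimeValue"), (1, 5, "userId", "Industry")],
)
conn.execute(
    "INSERT INTO piwik_custom_data_import VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    (1, 5, "2014-12-01 00:11:22", "Customer LTV & Industry", 9000000, "matt", "pending", 0, None),
)
print(count_rows(conn, 1))  # [1, 2, 1]
delete_import(conn, 1)
print(count_rows(conn, 1))  # [0, 0, 0]
```

Because every table carries idimport, removing an import is a single filtered delete per table.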

Notes

- Build this feature in the API as well, so that we can do the import via the UI or the API (or later even a console command)
- New UI to upload custom data sets, and new UI to view the existing data sets and their status
- Make the custom data dimensions available as Segments
- To make visualising the custom dimensions easy, display all custom data dimensions in the Visitor Log and Visitor Profile
- The custom data could also be used by plugins and/or displayed in new original ways!
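
As a rough illustration of using a custom data dimension as a Segment, the sketch below filters visits by an imported LifeTimeValue looked up through the visit's userId. It is plain Python over in-memory data, not Piwik's segment engine; `segment_visits`, the `idvisit` field and the visit records are hypothetical.

```python
def segment_visits(visits, custom_data, predicate):
    """Return visits whose userId maps to a custom value accepted by predicate."""
    selected = []
    for visit in visits:
        value = custom_data.get(visit["userId"])
        # Visits with no imported custom data never match the segment.
        if value is not None and predicate(value):
            selected.append(visit)
    return selected


# Imported lookup: userId -> LifeTimeValue (values from the example file).
ltv_by_user = {"clientid1": 500, "clientid2": 5, "clientid5": 100, "client765": 222}

visits = [
    {"idvisit": 1, "userId": "clientid1"},
    {"idvisit": 2, "userId": "clientid2"},
    {"idvisit": 3, "userId": "unknown"},
]

high_value = segment_visits(visits, ltv_by_user, lambda ltv: ltv >= 100)
print([visit["idvisit"] for visit in high_value])  # [1]
```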

@mattab commented on November 6th 2015

