blended data step 1 conceptual input


#1

Summary
Well, maybe (most probably) I don't understand the concept behind the two-step aggregation of blended data, but to me it feels like step 1 (Add your data tables) is unnecessary. Why not just choose two full tables? When I want to aggregate lots of fields, it's quite tricky to bring all the fields together in the "card editor" window, and a "take over all fields" option is also missing. So maybe I missed the concept, or maybe it really would be easier to 1. choose two or more tables, 2. deselect unwanted columns if necessary, and 3. define the SQL statement.

Again, I'm not sure if there is a deeper concept behind this that I don't understand, but I don't really see the point of the "detour" through metrics, categories and sorting in the "card editing" of step 1.

Steps to reproduce
n/a

Expected results
n/a

Actual results
n/a

Screenshot

Suggestion for fix
n/a


#2

Thanks for the feedback, Frank! I do agree that dragging lots of fields is quite annoying.

Just to share some background on the design concerns here. This version of the "Blend data" feature is a generic solution, designed to work with all kinds of data sources: SaaS APIs, files and database tables. For SaaS data sources (e.g., Google Analytics), it's often not possible to add "all" fields, because their APIs don't allow querying with an arbitrary combination of fields. So the first reason step 1 exists is to make sure that a valid data table is built before the "blend" step.

The other reason for that step is to allow additional filtering, time-range scoping and grouping, so that you can reduce the amount of data that reaches the "blend" step. Transferring a large amount of data into Datadeck from external sources can take quite a long time, and this filtering capability is a way to mitigate that.
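As a rough sketch of that idea (the table and column names are invented for illustration; this is not Datadeck's actual implementation), the step-1 filtering and grouping amount to pushing a WHERE clause and GROUP BY ahead of the blend, so far less data has to be transferred downstream:

```python
import sqlite3

# Hypothetical raw source data, as it might look before step 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (day TEXT, country TEXT, pageviews INTEGER)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        ("2023-01-05", "US", 120),
        ("2023-02-10", "US", 90),
        ("2023-02-11", "DE", 40),
        ("2023-03-01", "DE", 70),
    ],
)

# Step 1 in miniature: time-range scoping plus grouping happen before
# any blend, so only the small aggregated table moves on.
scoped = conn.execute(
    """
    SELECT country, SUM(pageviews)
    FROM visits
    WHERE day BETWEEN '2023-02-01' AND '2023-02-28'
    GROUP BY country
    ORDER BY country
    """
).fetchall()
print(scoped)
```

Only the two February rows survive the scoping, and they collapse into one row per country before the blend ever sees them.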

That said, I totally agree we should provide some convenience for scenarios with files and database tables as sources of blending. Maybe an "add all" button as you suggested, or a batch-selection dialog, would help. We'll put more thought into it :grin:.


#3

What great feedback, thank you! I see all your points! Looking forward to seeing the little changes that will help with aggregating a larger number of fields :slight_smile: :+1:t3:


#4

Thanks Frank. We’re always hungry for more feedback!


#5

…ok, here is some more :wink:. I'm struggling with the meaning and purpose of "metric" and "category" in step 1, and maybe also the sorting, which seems to help only for viewing. Can you help me understand? :slightly_smiling_face:


#6

The "metric" and "category" concepts are actually inherited from the card builder. It's pretty much the same as building a card with a table chart.

Category can be understood as "group by": if you drag a single field into Category, each unique value of that field will occupy a row; if you drag multiple fields, each combination of unique values of those fields will occupy a row. Metrics are aggregated values within the groups generated by Category. If you use SQL or Excel/CSV as your source tables in Blend, you get to choose the aggregation method (sum, average, etc.). The basic idea is to summarize all numeric values within a group into one single representation.

A concrete example: if your table contains order items, with each row holding Country, Segment and Sales, and you put Country and Segment into Category and Average(Sales) into Metric, you'll get the average sales value for each combination of Country and Segment. The whole thing is very similar to "SELECT … FROM … GROUP BY …" in SQL.
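To make that concrete, here is a minimal sketch using Python's built-in sqlite3 module (the table name and sample values are made up; Datadeck's own query engine may differ):

```python
import sqlite3

# Hypothetical order-item data; columns mirror the example above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (Country TEXT, Segment TEXT, Sales REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("US", "Consumer", 100.0),
        ("US", "Consumer", 300.0),
        ("US", "Corporate", 50.0),
        ("DE", "Consumer", 80.0),
    ],
)

# Category = the GROUP BY columns; Metric = one aggregated value per group.
rows = conn.execute(
    """
    SELECT Country, Segment, AVG(Sales)
    FROM orders
    GROUP BY Country, Segment
    ORDER BY Country, Segment
    """
).fetchall()

for country, segment, avg_sales in rows:
    print(country, segment, avg_sales)
```

Each (Country, Segment) combination occupies exactly one row, and the two US/Consumer rows collapse into a single average of 200.0.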

Sorting in the source table builder is about more than viewing once you take the 100K-row limit into consideration: you can use sorting to control which rows are included in the top 100K.
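A tiny sketch of that interaction, again with sqlite3 and made-up data, using a limit of 2 to stand in for the 100K-row cap:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (name TEXT, hits INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("a", 5), ("b", 50), ("c", 1), ("d", 9)],
)

LIMIT = 2  # stands in for the 100K-row limit

# Without an ORDER BY, which rows survive the cut is essentially
# arbitrary; with ORDER BY, the sort decides what makes the top LIMIT.
top = conn.execute(
    "SELECT name FROM events ORDER BY hits DESC LIMIT ?", (LIMIT,)
).fetchall()
kept = [name for (name,) in top]
print(kept)
```

Here sorting by hits descending guarantees the two busiest rows ("b" and "d") are the ones kept, rather than whichever two happened to come first.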