Tableau Introduction- Data source
Day 1
4 May 2017
Tableau
Tableau is an analytics tool which allows us to visualize data in various formats in order to analyze business trends. The current version of Tableau is 10.2.
We can download tableau desktop from www.tableau.com.
The very first page tableau displays is the connect page. It shows the number of technologies/databases tableau can connect to.
Above image shows the types of databases we can connect to.
Excel:
If we connect to Excel file,e.g. Sample- Superstore.xlsx, we will be directed to the next screen called Data source tab. On the Top left corner it shows the excel sheets name as Sample - Superstore. Right below it lists all the worksheets inside that excel sheet. e.g. Orders, People, Returns. Each worksheet is considered as one table in tableau. Following image shows us that Orders table is taken as a source here. Thus all column of orders table will be visible.
Preview data source button:
When we click this , column names, entire data is shown in the table as we generally see in database along with few extra symbols.
Manage Metadata Button:
This will show definition of the orders table. Field name(Name of the field as in Data Warehouse), Data type, remote field name(Name of column as in database).
There is a little downwards arrow on the field name, where we get lot of options like renaming the field name, Hide, Aliases, Create calculated field, Split etc.
Split option:
This option lets us separate a value as shown below.
Similarly, custom split can be used to split the values as shown below:
Data type:
Tableau automatically understands the type of a particular column and assigns a symbol to the field accordingly. In the image below, we can see that Order ID is shown as abc, Order date as Calender symbol, City as Earth symbol etc.
Tableau assigns following symbols to each datatype:
1) String: Abc symbol
2) Date: Calender symbol
3) Geographical column: Earth symbol
4) Numberic: # symbol
there is a small downwards arrow beside these symbols if we click that we will find similar options as we found on Manage metadata side.
We can also sort the data by a field as shown in the image below:
the rightmost side shows total count of rows loaded.
Type of connection:
1) Live: The live connection fetches the data from the database or excel sheet in real time. If i delete sample-superstore excel sheet then I will not be able to see the data. This shows current and most up-to-date data.
2) Extract: This loads the data in memory and fetches it from there. This is faster than live connection. however it may show stale data.
Filter:
We can apply filter on the entire data source and thus restrict the data coming from the table.
Following image shows filter on Category field:
Wildcard:- Lets us choose start, end etc.
Condition:-
Top:- lets us choose top or bottom rows
Whenever or where ever we try to filter any field we come across same kind of pop-up with 4 tabs:
General, Wildcard, Condition, Top.
Once we have added a filter, it will show number of filters added and it will also give us option to edit it.
Keep in mind once we filter data here, for example, we choose only "Furniture" category and filter out rest of the categories then we wont be able to see other categories in any reports/sheets we generate.
4 May 2017
Tableau
Tableau is an analytics tool which allows us to visualize data in various formats in order to analyze business trends. The current version of Tableau is 10.2.
We can download tableau desktop from www.tableau.com.
The very first page tableau displays is the connect page. It shows the number of technologies/databases tableau can connect to.
Above image shows the types of databases we can connect to.
Data Source tab
Excel:
If we connect to Excel file,e.g. Sample- Superstore.xlsx, we will be directed to the next screen called Data source tab. On the Top left corner it shows the excel sheets name as Sample - Superstore. Right below it lists all the worksheets inside that excel sheet. e.g. Orders, People, Returns. Each worksheet is considered as one table in tableau. Following image shows us that Orders table is taken as a source here. Thus all column of orders table will be visible.
Preview data source button:
When we click this , column names, entire data is shown in the table as we generally see in database along with few extra symbols.
Manage Metadata Button:
This will show definition of the orders table. Field name(Name of the field as in Data Warehouse), Data type, remote field name(Name of column as in database).
There is a little downwards arrow on the field name, where we get lot of options like renaming the field name, Hide, Aliases, Create calculated field, Split etc.
Split option:
This option lets us separate a value as shown below.
Similarly, custom split can be used to split the values as shown below:
Data type:
Tableau automatically understands the type of a particular column and assigns a symbol to the field accordingly. In the image below, we can see that Order ID is shown as abc, Order date as Calender symbol, City as Earth symbol etc.
Tableau assigns following symbols to each datatype:
1) String: Abc symbol
2) Date: Calender symbol
3) Geographical column: Earth symbol
4) Numberic: # symbol
there is a small downwards arrow beside these symbols if we click that we will find similar options as we found on Manage metadata side.
We can also sort the data by a field as shown in the image below:
the rightmost side shows total count of rows loaded.
Type of connection:
1) Live: The live connection fetches the data from the database or excel sheet in real time. If i delete sample-superstore excel sheet then I will not be able to see the data. This shows current and most up-to-date data.
2) Extract: This loads the data in memory and fetches it from there. This is faster than live connection. however it may show stale data.
Filter:
We can apply filter on the entire data source and thus restrict the data coming from the table.
Following image shows filter on Category field:
Wildcard:- Lets us choose start, end etc.
Condition:-
Top:- lets us choose top or bottom rows
Whenever or where ever we try to filter any field we come across same kind of pop-up with 4 tabs:
General, Wildcard, Condition, Top.
Once we have added a filter, it will show number of filters added and it will also give us option to edit it.
Keep in mind once we filter data here, for example, we choose only "Furniture" category and filter out rest of the categories then we wont be able to see other categories in any reports/sheets we generate.
Comments
Post a Comment