One of the questions that comes up quite often is “How do I remove duplicate records in Tableau Prep?” or “How do I dedup data in Tableau Prep?” Some data preparation tools have a specific feature to do this. At first glance, the first version of Tableau Prep doesn’t seem to have this feature. BUT it is definitely possible to remove duplicate records in Tableau Prep. And not only that, but it’s very EASY to remove duplicates. (And if you make it to the end, you’ll get a bonus: an LOD calculation in Tableau Prep!)

Let’s talk about three possible cases of removing duplicate records in Tableau Prep (each gets a bit harder):

1: Exact Duplicate Records in Tableau Prep

Let’s say you’ve got a data set of employees with an Employee ID, a Name, and a Date Hired. Employee ID 3 has two exactly identical records. Everything is the same: the ID, the Name, the Date.

No matter how many duplicate records you have, the fix in Tableau Prep is the same: add an Aggregate step and add all fields to the Grouped Fields section. Only unique rows are retained, and the duplicates vanish before your eyes. You can continue the Tableau Prep flow with a nicely deduped data set. (And thank you to Tom Fuller for pointing out this approach!)

2: Similar, but not exactly, Duplicate Records

Let’s modify the data set slightly and consider how we might eliminate the duplicate records in Tableau Prep. In this case, Walvoord was hired once in 1997 and then re-hired in 2014. (This data set doesn’t indicate the reason: did he have an intermediate job, or was it a hire to a new position?) Whatever the cause, let’s say we only care to know the most recent hire date.

You’ve probably already jumped to a possible solution for removing this duplicate data. It’s very similar to the previous solution, but here we group only by Employee ID and Name, while the Date Hired field gets aggregated as a MAX. That means we get the max hire date for every unique employee. The unwanted near-duplicate record is removed, and we end up with one record per employee.

The third case is one I’ve already covered elsewhere: how to get the latest snapshot of records. It fits nicely here because it is a case of duplicate data that needs to be deduped.

Let’s extend the previous input data set so that there is still a most recent hire date, but we can’t simply group and aggregate, because the record carries additional data that we need to retain: a Position field. Now we know why Walvoord has two hire dates. They represent two different positions.

So, we need to keep the record with the latest hire date, but we also need to keep the other values for that record intact. We can’t use the previous approach and GROUP by Position, because we’d get two records. And we can’t AGGREGATE Position either, because we’d end up with an arbitrary value. (Should it be MIN or MAX? One of those might work in this specific case, but other combinations of Positions would give different results whichever we chose.)

Instead, we’ll extend our solution just as the data set has been extended.
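For readers who think in code, here is a minimal Python sketch of the three dedup cases, written outside Tableau Prep purely as an illustration. The sample rows, the field names (`id`, `name`, `hired`, `position`), the exact dates, the second employee, and the position titles are all assumptions made up for the example; only Walvoord's 1997 and 2014 hire years come from the text. The third function shows one common way to keep the full latest record (compute the max date per employee, then keep the rows that match it) and is not necessarily the exact flow used in Tableau Prep.

```python
from datetime import date

# Hypothetical sample data: only Walvoord's 1997 and 2014 hire years
# come from the article; everything else is invented for illustration.
rows = [
    {"id": 1, "name": "Walvoord", "hired": date(1997, 5, 1), "position": "Analyst"},
    {"id": 1, "name": "Walvoord", "hired": date(2014, 3, 1), "position": "Manager"},
    {"id": 2, "name": "Smith", "hired": date(2010, 6, 15), "position": "Engineer"},
    {"id": 2, "name": "Smith", "hired": date(2010, 6, 15), "position": "Engineer"},
]


def dedupe_exact(rows):
    """Case 1: group by every field, so only distinct rows survive."""
    seen = {}
    for row in rows:
        key = tuple(sorted(row.items()))  # all fields form the group key
        seen.setdefault(key, row)         # keep the first row of each group
    return list(seen.values())


def max_hire_date(rows):
    """Case 2: group by id and name, aggregate the hire date as MAX."""
    latest = {}
    for row in rows:
        key = (row["id"], row["name"])
        if key not in latest or row["hired"] > latest[key]:
            latest[key] = row["hired"]
    return [{"id": i, "name": n, "hired": d} for (i, n), d in latest.items()]


def keep_latest_record(rows):
    """Case 3: find the max hire date per employee, then keep the
    complete rows (position included) that match that date."""
    latest = {(r["id"], r["name"]): r["hired"] for r in max_hire_date(rows)}
    return [
        r for r in dedupe_exact(rows)
        if r["hired"] == latest[(r["id"], r["name"])]
    ]


if __name__ == "__main__":
    for record in keep_latest_record(rows):
        print(record)
```

The point of the third function is that Position is never grouped or aggregated. It simply rides along with whichever row carries the latest hire date, which avoids the arbitrary MIN or MAX problem described above.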