The Extract Tool is useful when you need to extract something from inside a column. For example, if you have a column containing people's names, the Extract Tool could be used to retrieve just the first name and put that in a new column. Or, if you have a column containing dates and you want to get just the year, the Extract Tool can pull out the year portion of the date.
The Extract Tool's options look like this:
To use the tool you specify the "Source Column", this is the column to extract from. You must then specify the name of the new column that you want to create and where in the source column to start and end the extract.
The Start dropdown has the following options:
- The beginning: Meaning, the beginning of the source column.
- At a specific position: Start the extract at a specific position, with 1 being the first letter or digit of the source column.
- At the first occurrence of: Start the extract at the first occurrence of a particular letter or number.
- After the first occurrence of: Start the extract after the first occurrence of a particular letter or number.
The End dropdown has a similar set of options for specifying where to end the extract:
- The end: Meaning, the end of the source column.
- At a specific position: End the extract at a specified position of the source column.
- At a specific length: Meaning, extract the exact number of letters or digits specified.
- At the next occurrence of: End the extract at the next occurence of a specific letter or number within the source column.
- After the next occurrence of: End the extract after the next occurrence of the specified number or letter.
When you use this tool to extract a portion from one column, the new column it creates is just like any other column in your table. You can use another Extract Tool on the new column if necessary to get at an inner portion of the extracted text, or use any of the other tools on your newly created column.
Useful Extract Recipes
The following are some useful recipes for extracting values from columns
Get the Hour from a Date
QueryTree can detect many forms of date in the data that is loaded into it. When it detects a date, QueryTree will display it in a standardised way:
2008-09-03 12:25:41
This consistent format for dates makes it simple to extract certain parts of a date using the Extract Tool. For example, to get the hour portion of the date, configure the Extract Tool as follows:
Start: After the first occurrence of
Start Text: " " (i.e. type a space into the Start Text edit box)
End: At the first occurrence of
End Text: ":"
The extracted column will now contain the hour from the date column.
Get the Domain (i.e. the website address) From an Email Address
If a column contains email addresses, extracting the domain (or website address) part of the email address can be done by setting up the Extract Tool as follows:
Start: After the first occurence of
Start Text: "@"
End: At the first occurrence of
End Text: " " (i.e. type a space into the End Text edit box)
The extracted column will now contain the domain part of the email addresses, e.g. "gmail.com".
Get the First Sentence from a Block of Text
To get the first sentence of a block of text, set up the Extract Tool as follows:
Start: The beginning
End: After the first occurrence of
End Text: "."
The extracted column will now contain the first sentence of the text in the source column.