Subscribe

Google BigQuery offers a number of data types that you can use to store your data. Having a firm understanding of these data types is necessary if you want to take full advantage of the warehousing tool's on-demand offerings and capabilities. By understanding the different data types available, you can choose the best option for storing your data in BigQuery.

BigQuery is used to store and query very large datasets, so it's important to understand the capabilities of the system before working with it. It can be overwhelming trying to keep track of all the different data types available in BigQuery. To help you out, we've compiled a list of the most common data types that you'll come across:

**Understanding the Different Google BigQuery Data Types**

**1. Numeric **

The NUMERIC data type is used to store numbers. This data type can store up to 38 digits of precision and 9 decimal places. Numbers stored as NUMERIC data types will have 64-bit precision.

**Arithmetic/ Mathematical Functions**

You can use the mathematical functions with the NUMERIC data type to perform various arithmetic operations. These include addition, subtraction, multiplication, division, modulo, exponential function, natural logarithm, square root, etc.

The following numerical functions can be performed on the NUMERIC data type:

ADD()

SUB()

MULTIPLY()

DIVIDE()

MOD()

EXP()

LN()

LOG2()

LOG10()

POW()

SQRT()

CEILING() and FLOOR()

ABS()

SIGN()

TRUNCATE()

**Standard Compliant Floating Point Division**

The IEEE_divide function can be used to avoid division by zero errors in SQL queries. This function returns NaN (not-a-number) when a division by zero is attempted, which enables the query to run without returning any errors. For example:

IEEE_divide(1,0) = NaN

IEEE_divide(2,3) = 0.666666

Using specific functions like IEEE_divide can help to prevent errors and ensure that SQL queries run smoothly.

**Safe Functions**

With the help of the SAFE command, you can return NULL for mathematical operations that would otherwise return an error. This is because the SAFE command enables you to perform undefined operations safely. For example, if you try to take the logarithm of a negative number, it will usually result in an error. However, with the SAFE command, it will simply return NULL. This can be extremely helpful in cases where you need to perform operations that may not have a defined answer.

**Precise Decimal Calculations with Numeric **

Numeric data types are essential for financial and accounting applications that require exact values for calculations. BigQuery uses 16 Bytes for storage and can represent decimal figures accurately, making it an ideal choice for such applications.

Let's say we want to perform a calculation and want our answer to be precise up to the 18th decimal place. We can use the NUMERIC data type with a scale of 18 to perform such precise calculations.

**2. Boolean**

The BOOLEAN data type is used to store true/false values. Values stored as BOOLEAN data type can be either 1 (true) or 0 (false).

**Logical Operations**

You can use logical operators with the BOOLEAN data type to perform various operations. These include: and, or, not.

The following logical operators can be performed on the BOOLEAN data type:

AND()

OR()

NOT()

These are generally used in SQL queries to filter data based on certain conditions.

**Conditional Operations**

Conditional operations are used to check if a condition is met and then return a certain value accordingly. If the condition is not met, then another value is returned.

The following is the syntax of a conditional operation that can be performed on the BOOLEAN data type:

CASE expr

WHEN expr_to_match THEN result

[ ... ]

[ ELSE else_result ]

**3. String**

BigQuery supports many functions for working with strings. Strings can be of variable length and must be quoted with single, double, or triple quotation marks. The following functions are available for string manipulation:

**Casting and Coercion**

The CAST function can be used to convert most data types from one type to another. However, using CAST may cause a query to fail if BigQuery is unable to perform the conversion. To avoid this, you can use SAFE_CAST. This will protect your queries from errors that could occur during casting.

BigQuery will automatically convert the result type of an argument expression to another type if needed to match function signatures. For example, if a function is defined to take a single argument of type FLOAT64 and an expression being used as an argument has a result type of INT64, the expression's result will be converted to the FLOAT64 type before func() is computed. This allows for greater flexibility when working with functions.

**Concatenation**

The CONCAT command is a useful tool for manipulating strings. It can be used to concatenate separate parts of different strings into one, making it easy to combine text from different cells or the results of mathematical operations. This can be helpful when trying to create a single string from multiple sources, or when needing to perform complex string operations in a concise way.

For example, let's say we have two cells, A1 and A2, and each cell contains a different word:

A1: Hello

A2: World

If we wanted to combine these two words into one cell, we could use the CONCAT command. The resulting cell would look like this:

HelloWorld

**Regular Expressions**

Regular expressions are an important skill for anyone working with data. They can be used to match, search and replace text, or to validate a string input. For example, you may want to use a regular expression to extract certain parts of a string, or to validate whether a string has a specific format. Regular expressions can also be used to replace or remove certain characters.

BigQuery supports RegEx functions like REGEXP_CONTAINS, REGEXP_EXTRACT, REGEXP_EXTRACT_ALL, REGEXP_REPLACE, etc which can be used to perform various operations on strings.

REGEXP_CONTAINS: This function returns true if the given string contains a match of the regular expression, or false otherwise.

REGEXP_EXTRACT: This function extracts the first matched sequence of characters that matches the regular expression pattern.

REGEXP_EXTRACT_ALL: This function extracts all matches of the regular expression from the given string.

REGEXP_REPLACE: This function replaces all matches of the regular expression in the given string with a replacement string. If there are no matches, it returns the original string.

**4. Bytes **

Although bytes and strings might look interchangeable, they are actually two different types that cannot be used interchangeably. Most functions that work on strings will also work on bytes, but the bytes version operates on raw data instead of Unicode characters.

It's important to remember that bytes are distinct from strings, and trying to use them interchangeably can cause errors. Bytes are represented using either single, double, or triple quotation marks, but must always begin with the letter B (for example: b'hello').

**5. Time **

The TIME data type represents a time of day, with no reference to a specific date. TIME values can range from 00:00:00 to 23:59:59.

BigQuery supports various data types for Time instances they include:

**TIME:** A TIME value represents a time of day, as might be displayed on a clock.

**TIMESTAMP: **For representing an absolute point in time, a TIMESTAMP value is used.

**6. Date **

If we want to represent a day, month and year, DATE values are used. DATE values can range from 0000-01-01 to 9999-12-31.

BigQuery supports various data types for Date instances they include:

**DATE: **A DATE value represents a day, with no reference to a specific time or time zone.

**DATETIME: **For representing an absolute point in time that includes both date and time components, DATETIME values are used.

**7. Geography **

Points, lines, and polygons on the surface of the earth can be represented using the Geography data type in BigQuery. Positions are represented in the World Geodetic System 1984 (WGS84) reference ellipsoid, which is also used by the Global Positioning System (GPS). This means that longitude and latitude can be used directly in BigQuery to find the location of a place.

The geography type is a collection of the following objects:

**Point geography:**

A point on the Earth's surface is defined by its latitude and longitude. Latitude is a measure of how far north or south of the equator a point is, and is measured in degrees. Longitude is a measure of how far east or west of the prime meridian a point is, and is measured in degrees (or sometimes in time). Together, latitude and longitude define a point's position on the Earth's surface.

**Linestring geography:**

A linestring is a sequence of points that connects to form a line. Unlike polygons, linestrings do not have defined sides. Instead, they are created by connecting a series of latitude and longitude coordinates to create a path.

**Polygon geography:**

The polygon is the most complex of the three geometry types. A polygon is defined by a sequence of points, which together form the polygon's exterior boundary. Each point in the boundary has a latitude and longitude, which together define its position on the Earth's surface. In addition to an exterior boundary, a polygon can also have one or more interior boundaries, which define holes in the polygon.

**Conclusion**

Datatypes are an important concept of BigQuery, and understanding how to use them is critical to getting the most out of the platform. In this article, we've looked at the different data types available in BigQuery, as well as how to convert between them.

Boltic is a freemium tool that makes it easy to work with BigQuery. With our easy-to-use interface, you can easily import data of any data source to BigQuery. Our team Sign up for Boltic free account and get all of the benefits of BigQuery without having to worry about the technical details.