The 5 minute introduction to Osquery


Osquery is a an awesome host instrumentation framework from Facebook. It can instrument Mac, Linux, and Windows servers. It organises system data in tables that you can query using your favourite query language – SQL. It is SQL for your infrastructure. You can query for system intruders, system information, compliance, installed apps, running processes, and many more data points.

Osquery uses SQLite syntax for SQL. So, if you need more information about SQL syntax outside of what is covered in osquery documentation then you should give SQLite documentation a read.

To use Osquery for monitoring your servers you need to install osquery agent. Once that done, you can make SQL queries to fetch relevant information.

Osquery is built on 4 design principles:

  1. Simple: This means users of the tool should be work with high level abstractions that are easy to use, deploy, and maintain.
  2. Performant and reliable: This means services should not be impacted by osquery consuming more resources than required.
  3. Easy to integrate: It should be a good citizen so that it can integrate with existing infrastructure.
  4. Flexible: It means osquery should be flexible to meet different use cases like intrusion detection, vulnerability management, compliance, or any other use case specific to end user domain.

Why osquery?

The following are the main reasons why you would want to use osquery:

  1. Osquery expose system information as a relational database that you can query using SQL. Anyone with the basic knowledge of SQL can start using it in minutes.
  2. Osquery is extensible. You can write tables if they currently does not exist.
  3. Because Osquery uses SQL you can join multiple tables together to perform detailed analysis.
  4. Osquery is cross platform. This means you can use a single tool to work with different OS
  5. Osquery is open source and there is a big community behind it.

Installation

You can download Osquery binary from the official page.

If you are using Mac then you can also use brew to download and install osquery.

brew update
brew install osquery

Osquery has three main components:

  • osqueryi: The interactive osquery shell, for performing ad-hoc queries
  • osqueryd: A daemon for scheduling and running queries in the background.
  • osqueryctl: A helper script for testing a deployment or configuration of osquery.

You can now login to osquery shell by typing osqueryi

osqueryi

Once inside the shell you will see following

Using a virtual database. Need help, type '.help'
osquery>

For some queries that require system level access you need to run with sudo

10 things you can do with Osquery

The best way to learn a tool is to use it. So, let’s osquery by asking few questions.

Before we start querying let’s query all the osquery tables. To do that you will type .tables command.

osquery> .tables

Question 1: List system information

select cpu_type, cpu_brand, hardware_vendor, hardware_model from system_info;

The output on my machine is following

+----------+------------------------------------------+-----------------+----------------+
| cpu_type | cpu_brand                                | hardware_vendor | hardware_model |
+----------+------------------------------------------+-----------------+----------------+
| x86_64h  | Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz | Apple Inc.      | MacBookPro15,1 |
+----------+------------------------------------------+-----------------+----------------+

Question 2: Find remaining hard disk storage

select path, type, round((blocks_available * blocks_size *10e-10),2) as gigs_free from mounts where path='/';

The output on my machine is following

+------+------+-----------+
| path | type | gigs_free |
+------+------+-----------+
| /    | apfs | 95.66     |
+------+------+-----------+

Question 3: Find apps installed on machine

select display_name, bundle_short_version, bundle_version from apps limit 1;

The output on my machine is following

+------------------------------+----------------------------+----------------------+
| display_name                 | bundle_short_version       | bundle_version       |
+------------------------------+----------------------------+----------------------+
|                              |                            | 19.021.20047         |
|                              |                            | 19.021.20047         |
| AdobeResourceSynchronizer    | 19.021.20047               | 19.021.20047         |
|                              | 19.021.20056               | 19.021.20056         |

Question 4: Find the process running on port 8080

select pid from listening_ports where port = 8080;

The output of the query

+-------+------+----------+--------+---------+-----+---------------------+------+
| pid   | port | protocol | family | address | fd  | socket              | path |
+-------+------+----------+--------+---------+-----+---------------------+------+
| 10890 | 8080 | 6        | 10     | 0.0.0.0 | 102 | 7244862416501030093 |      |
+-------+------+----------+--------+---------+-----+---------------------+------+

We can also join it with processes table to get more information

osquery> select p.pid, p.name, p.state,p.uid, lp.port from processes p join listening_ports lp on p.pid = lp.pid and lp.port=8080;
+-------+------+-------+-----+------+
| pid   | name | state | uid | port |
+-------+------+-------+-----+------+
| 16735 | java | R     | 502 | 8080 |
+-------+------+-------+-----+------+

Question 5: Find top 5 most CPU intensive processes

SELECT pid, uid, name, ROUND((
  (user_time + system_time) / (cpu_time.tsb - cpu_time.itsb)
) * 100, 2) AS percentage
FROM processes, (
SELECT (
  SUM(user) + SUM(nice) + SUM(system) + SUM(idle) * 1.0) AS tsb,
  SUM(COALESCE(idle, 0)) + SUM(COALESCE(iowait, 0)) AS itsb
  FROM cpu_time
) AS cpu_time
ORDER BY user_time+system_time DESC
LIMIT 5;

On my machine answer is following

+-------+-----+----------------------------+------------+
| pid   | uid | name                       | percentage |
+-------+-----+----------------------------+------------+
| 424   | 502 | Google Chrome              | 73.26      |
| 788   | 502 | com.docker.hyperkit        | 72.16      |
| 506   | 502 | Google Chrome Helper (GPU) | 64.36      |
| 507   | 502 | Google Chrome Helper       | 16.14      |
| 18820 | 502 | Evernote                   | 8.12       |
+-------+-----+----------------------------+------------+

Question 6: Find top 5 most memory intensive processes

SELECT pid, name, ROUND((total_size * '10e-7'), 2) AS used FROM processes ORDER BY total_size DESC LIMIT 5;

On my machine answer is following

+-------+----------------------------+---------+
| pid   | name                       | used    |
+-------+----------------------------+---------+
| 788   | com.docker.hyperkit        | 1672.31 |
| 10016 | idea                       | 1617.92 |
| 506   | Google Chrome Helper (GPU) | 1536.51 |
| 18820 | Evernote                   | 1131.83 |
| 2131  | iTerm2                     | 465.49  |
+-------+----------------------------+---------+

Question 7: Find who is logged into the system

select * from logged_in_users;

The output on my machine is following

+------+---------------+---------+------+------------+-------+
| type | user          | tty     | host | time       | pid   |
+------+---------------+---------+------+------------+-------+
| user | shekhargulati | console |      | 1574972872 | 191   |
| user | shekhargulati | ttys001 |      | 1575235747 | 2135  |
| user | shekhargulati | ttys002 |      | 1575236766 | 13669 |
+------+---------------+---------+------+------------+-------+

Question 8: List Python packages installed in default Python

select name, version from python_packages;

The output on my machine is following

+--------------+---------+
| name         | version |
+--------------+---------+
| Jinja2       | 2.10    |
| MarkupSafe   | 1.1.1   |
| PyNaCl       | 1.3.0   |
| ipaddress    | 1.0.22  |
| jmespath     | 0.9.4   |
| paramiko     | 2.4.2   |
| pyasn1       | 0.4.5   |
| pycparser    | 2.19    |
| urllib3      | 1.25.3  |
+--------------+---------+

Question 9: Find the last docker image installed in my local registry

select id,tags from docker_images order by created desc limit 1;

The output on my machine is following

+------------------------------------------------------------------+---------------------------+
| id                                                               | tags                      |
+------------------------------------------------------------------+---------------------------+
| 460f8b98db51e0f35d4eea4416907635ff207d73239214926ebce71ed2ac785c | image-resolver-api:latest |
+------------------------------------------------------------------+---------------------------+

Question 10: Finding processes that are running whose binary has been deleted from the disk

Attackers will leave a malicious process running but delete the original binary on disk.

SELECT name, path, pid FROM processes WHERE on_disk = 0;

There is no process running on my machine whose binary has been deleted.

How it works?

Osquery doesn’t just use SQLite syntax, the query engine is SQLite. Osquery gets all of the query parsing, optimization and execution functionality from SQLite, enabling the project to focus on finding the most relevant sources for instrumentation data.

It’s important to mention that, while osquery uses the SQLite query engine, it does not actually use SQLite for data storage. Most data is generated on-the-fly at query execution time through a concept we call “Virtual Tables”. Osquery does need to store some data on the host, and for this it uses an embedded RocksDB database

Things to keep in mind

  • Queries can cause instance OOM
  • Queries can increase CPU utilization
  • You can have overlapping queries if your one query does not finish before second starts

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s