RedShift: Script to Show Running Queries

Amazon Redshift, a powerhouse in cloud-based data warehousing, enables organizations to efficiently manage and analyze extensive datasets. For Redshift Database Administrators, understanding running queries is essential for monitoring performance, troubleshooting issues, and optimizing your database interactions.

This blog post offers a straightforward guide, providing both a SQL script and a graphical interface approach, to effortlessly uncover running queries, recent queries, and connections in AWS Redshift.

Show Running Queries in Redshift

Gain instant awareness into the queries currently executing in the Redshift Cluster, a key aspect of real-time performance monitoring.

-- Show running queries
SELECT user_name, db_name, pid, query
FROM stv_recents
WHERE status = 'Running';

/*
 user_name |   db_name   |  pid  |            query
-----------+-------------+-------+-----------------------------
 admin     | sample_db   | 12345 | SELECT * FROM important_table
 analyst   | analytics_db| 67890 | UPDATE customer_data SET status='active'
*/

Show Recent Queries in Redshift

Dig into historical query data in Redshift for a profound understanding of past activities and optimization opportunities.

-- Show recent queries (td, join with user table)
SELECT userid,query,pid,substring,starttime,endtime,elapsed,aborted
FROM svl_qlog 
ORDER BY starttime DESC
LIMIT 100;

/*
 userid |            query            |  pid  |   substring   |       starttime        |        endtime         | elapsed | aborted
--------+-----------------------------+-------+----------------+------------------------+------------------------+---------+---------
 54321  | SELECT * FROM public.orders | 13579 | SELECT * FROM p | 2022-09-27 15:30:00   | 2022-09-27 15:35:00   | 300000  | f
 98765  | INSERT INTO data_table ...   | 24680 | INSERT INTO dat | 2022-09-27 15:25:00   | 2022-09-27 15:30:00   | 300000  | f
*/

Show Recent Connections in Redshift

Stay informed about recent connections, gaining valuable insights into your Redshift Cluster access.

-- Show recent connections
SELECT recordtime, username, dbname, remotehost, remoteport
FROM stl_connection_log
WHERE event = 'initiating session'
AND pid NOT IN 
  (SELECT pid FROM stl_connection_log
   WHERE event = 'disconnecting session')
ORDER BY 1 DESC;

/*
      recordtime       |  username  |     dbname     |   remotehost   | remoteport
------------------------+------------+----------------+----------------+------------
 2022-09-27 16:00:00   | admin      | sample_db      | 192.168.1.100  | 5432
 2022-09-27 15:45:00   | analyst    | analytics_db   | 192.168.1.50   | 5432
*/

GUI Approach: AWS Redshift Management Console

For those who prefer a graphical interface, follow these steps using the AWS Redshift Management Console:

1. Open the AWS Management Console and navigate to the Redshift service.
2. Select the cluster that you want to view the running queries for.
3. In the cluster details page, click on the “Queries” tab.
4. This will open the Query Monitoring page, which shows a list of all queries that are currently running on your cluster.
5. You can use the filters at the top of the page to view only the queries that meet specific criteria, such as queries that are running for a certain amount of time, or queries that are using a specific database or user.
6. To view the details of a specific query, click on the query ID in the “ID” column. This will open the Query Details page, which shows information about the query, such as the query text, the start time, the status, and the query plan.
7. From the Query Details page, you can also cancel a running query by clicking on the “Cancel” button. This can be useful if you need to stop a long-running query.

Conclusion: Monitoring Redshift Performance

By using these Redshift commands, you effortlessly gain a comprehensive view of your cluster’s activities. Here’s a quick summary of each system tables/views being used:

# stv_recents: This table is a system view providing information about running queries, crucial for real-time monitoring. Each row represents a currently executing query, including user information, database, process ID (pid), and the actual query being executed.
# svl_qlog: This system view logs query execution details, including user ID, query text, process ID, substring, start time, end time, elapsed time, and whether the query was aborted. This is useful for historical query analysis.
# stl_connection_log: This table captures details about recent connections to the Redshift cluster. By filtering for ‘initiating session’ events and excluding ‘disconnecting session’ events, we obtain a list of ongoing connections.

By leveraging these Redshift commands, users can gain comprehensive visibility into their cluster’s activities, enhancing their ability to optimize performance and troubleshoot issues effectively. Whether querying real-time data or historical logs, Redshift offers a versatile range of tools to manage and monitor your cloud-based data warehouse.

For more information, explore the Redshift documentation on Monitoring an Amazon Redshift Cluster Performance.