Skip to content
Closed
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions .coderabbit.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
reviews:
path_filters: ["**/*.yml","**/*.yaml"]
7 changes: 7 additions & 0 deletions .semgrep.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
# .semgrep.yml
include:
- semgrep/example.py

configs:
- semgrep/semgrep.yml

32 changes: 32 additions & 0 deletions semgrep/example.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
import os
import sys
import hashlib

# Hardcoded credentials
USERNAME = "admin"
PASSWORD = "secret123"
Comment on lines +5 to +7
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Remove hardcoded credentials - critical security vulnerability

Hardcoded credentials in source code pose a severe security risk. These credentials can be exposed through version control, code reviews, or if the source code is compromised.

Replace hardcoded credentials with environment variables or a secure configuration management system:

-# Hardcoded credentials
-USERNAME = "admin"
-PASSWORD = "secret123"
+# Load credentials from environment variables
+USERNAME = os.environ.get("APP_USERNAME")
+PASSWORD = os.environ.get("APP_PASSWORD")
+
+if not USERNAME or not PASSWORD:
+    raise ValueError("Missing required environment variables: APP_USERNAME, APP_PASSWORD")

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In semgrep/example.py around lines 5 to 7, the USERNAME and PASSWORD are
hardcoded, which is a critical security vulnerability. Remove these hardcoded
credentials and instead load them from environment variables or a secure
configuration management system. Use appropriate methods to read environment
variables securely at runtime to avoid exposing sensitive information in the
source code.


def dangerous_eval():
user_input = input("Enter a Python expression: ")
result = eval(user_input)
print("Evaluated result:", result)
Comment on lines +9 to +12
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Remove eval() on user input - critical code injection vulnerability

Using eval() on user input allows arbitrary code execution, which can lead to complete system compromise. An attacker could execute any Python code, access files, or perform malicious operations.

If you need to evaluate mathematical expressions, use a safe alternative like ast.literal_eval() for literals only, or a proper expression parser:

+import ast
+
 def dangerous_eval():
-    user_input = input("Enter a Python expression: ")
-    result = eval(user_input)
-    print("Evaluated result:", result)
+    user_input = input("Enter a Python literal: ")
+    try:
+        # Only evaluates Python literals (strings, numbers, tuples, lists, dicts, booleans, None)
+        result = ast.literal_eval(user_input)
+        print("Evaluated result:", result)
+    except (ValueError, SyntaxError) as e:
+        print("Invalid input:", e)

For mathematical expressions, consider using a dedicated library like numexpr or simpleeval.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def dangerous_eval():
user_input = input("Enter a Python expression: ")
result = eval(user_input)
print("Evaluated result:", result)
import ast
def dangerous_eval():
user_input = input("Enter a Python literal: ")
try:
# Only evaluates Python literals (strings, numbers, tuples, lists, dicts, booleans, None)
result = ast.literal_eval(user_input)
print("Evaluated result:", result)
except (ValueError, SyntaxError) as e:
print("Invalid input:", e)
🤖 Prompt for AI Agents
In semgrep/example.py lines 9 to 12, the use of eval() on user input creates a
critical code injection vulnerability. Replace eval() with a safer alternative
such as ast.literal_eval() if only literals are expected, or use a dedicated
expression parsing library like numexpr or simpleeval to safely evaluate
mathematical expressions. This change will prevent arbitrary code execution from
user input.


def delete_data(path):
os.system("rm -rf " + path) # Semgrep: shell injection
Comment on lines +14 to +15
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix shell injection vulnerability in delete_data

Using os.system() with string concatenation allows shell injection attacks. An attacker could execute arbitrary shell commands by passing malicious input.

Use shutil.rmtree() for safe directory deletion, or subprocess.run() with a list of arguments:

+import shutil
+
 def delete_data(path):
-    os.system("rm -rf " + path)  # Semgrep: shell injection
+    # Validate path to prevent directory traversal
+    if not path or ".." in path:
+        raise ValueError("Invalid path")
+    
+    # Use shutil for safe file/directory deletion
+    if os.path.exists(path):
+        if os.path.isdir(path):
+            shutil.rmtree(path)
+        else:
+            os.remove(path)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def delete_data(path):
os.system("rm -rf " + path) # Semgrep: shell injection
import shutil
def delete_data(path):
# Validate path to prevent directory traversal
if not path or ".." in path:
raise ValueError("Invalid path")
# Use shutil for safe file/directory deletion
if os.path.exists(path):
if os.path.isdir(path):
shutil.rmtree(path)
else:
os.remove(path)
🤖 Prompt for AI Agents
In semgrep/example.py at lines 14 to 15, the use of os.system() with string
concatenation in delete_data creates a shell injection vulnerability. Replace
os.system("rm -rf " + path) with shutil.rmtree(path) to safely delete the
directory without invoking the shell, or alternatively use subprocess.run() with
a list of arguments to avoid shell interpretation. This change will prevent
arbitrary command execution from malicious input.


def hash_password(password):
hashed = hashlib.md5(password.encode()).hexdigest() # Semgrep: weak hash
return hashed
Comment on lines +17 to +19
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Replace MD5 with secure password hashing

MD5 is cryptographically broken and should never be used for password hashing. It's vulnerable to collision attacks and can be brute-forced easily.

Use a proper password hashing library like bcrypt:

-import hashlib
+import bcrypt
 
 def hash_password(password):
-    hashed = hashlib.md5(password.encode()).hexdigest()  # Semgrep: weak hash
-    return hashed
+    # Generate a salt and hash the password
+    salt = bcrypt.gensalt()
+    hashed = bcrypt.hashpw(password.encode('utf-8'), salt)
+    return hashed.decode('utf-8')

You'll need to install bcrypt: pip install bcrypt

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def hash_password(password):
hashed = hashlib.md5(password.encode()).hexdigest() # Semgrep: weak hash
return hashed
-import hashlib
+import bcrypt
def hash_password(password):
- hashed = hashlib.md5(password.encode()).hexdigest() # Semgrep: weak hash
- return hashed
+ # Generate a salt and hash the password
+ salt = bcrypt.gensalt()
+ hashed = bcrypt.hashpw(password.encode('utf-8'), salt)
+ return hashed.decode('utf-8')
🤖 Prompt for AI Agents
In semgrep/example.py around lines 17 to 19, the function uses MD5 for password
hashing, which is insecure. Replace the MD5 hashing with bcrypt by importing the
bcrypt library, then hash the password using bcrypt's hashing function. Make
sure to install bcrypt via pip before running the code.


def main():
print("Logging in as", USERNAME)
password_hash = hash_password(PASSWORD)
print("Password hash:", password_hash)
Comment on lines +21 to +24
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Don't print sensitive credentials

Printing credentials to stdout can expose them in logs, terminals, or monitoring systems.

Remove or mask credential output:

 def main():
-    print("Logging in as", USERNAME)
+    print("Logging in as user")
     password_hash = hash_password(PASSWORD)
-    print("Password hash:", password_hash)
+    # Don't print password hashes
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def main():
print("Logging in as", USERNAME)
password_hash = hash_password(PASSWORD)
print("Password hash:", password_hash)
def main():
print("Logging in as user")
password_hash = hash_password(PASSWORD)
# Don't print password hashes
🤖 Prompt for AI Agents
In semgrep/example.py around lines 21 to 24, the code prints sensitive
credentials such as USERNAME and password hash to stdout, which can expose them
in logs or monitoring systems. Remove the print statements that output USERNAME
and password_hash or replace them with non-sensitive messages that do not reveal
credential details.


if len(sys.argv) > 1:
delete_data(sys.argv[1])
Comment on lines +26 to +27
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Validate command line arguments before file operations

Accepting unvalidated file paths from command line arguments is dangerous and could lead to unintended file deletion.

Add proper validation and confirmation:

     if len(sys.argv) > 1:
-        delete_data(sys.argv[1])
+        path = sys.argv[1]
+        # Validate path
+        if not os.path.exists(path):
+            print(f"Path does not exist: {path}")
+            return
+        
+        # Require confirmation for destructive operations
+        confirm = input(f"Are you sure you want to delete {path}? (yes/no): ")
+        if confirm.lower() == "yes":
+            delete_data(path)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if len(sys.argv) > 1:
delete_data(sys.argv[1])
if len(sys.argv) > 1:
path = sys.argv[1]
# Validate path
if not os.path.exists(path):
print(f"Path does not exist: {path}")
return
# Require confirmation for destructive operations
confirm = input(f"Are you sure you want to delete {path}? (yes/no): ")
if confirm.lower() == "yes":
delete_data(path)
🤖 Prompt for AI Agents
In semgrep/example.py around lines 26 to 27, the code uses a command line
argument directly for file deletion without validation, which is unsafe. Add
validation to check that the argument is a valid and expected file path before
calling delete_data. Additionally, implement a confirmation prompt to ensure the
user intends to delete the specified file, preventing accidental deletions.


dangerous_eval()

main()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Use proper script execution guard

Direct execution of main() without a guard can cause issues when the module is imported.

-main()
+if __name__ == "__main__":
+    main()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
main()
if __name__ == "__main__":
main()
🤖 Prompt for AI Agents
In semgrep/example.py at line 31, the main() function is called directly without
a script execution guard. To fix this, wrap the main() call inside an if
__name__ == "__main__": block to ensure main() only runs when the script is
executed directly, preventing unintended execution when the module is imported.


8 changes: 8 additions & 0 deletions semgrep/semgrep.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
rules:
- id: hardcoded-password
pattern: password = "$SECRET"
message: "Avoid hardcoded passwords"
severity: ERROR
languages: [python]
metadata:
category: security
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Rule is overly narrow – catches only password = "$SECRET" literals

Today the rule fires only when
password = "$SECRET" matches verbatim. Hard-coded secrets usually vary:

db_password = "hunter2"
passwd = "super-secret"

Leverage metavariables + regex to generalise:

-    pattern: password = "$SECRET"
+    pattern: $PW_VAR = "$SECRET"
+    metavariables:
+      - metavariable: $PW_VAR
+        regex: (?i).*password.*

This still triggers on assignments to variables whose name contains “password” (case-insensitive) while keeping the same literal check.

🤖 Prompt for AI Agents
In semgrep/semgrep.yml around lines 2 to 8, the rule only matches the exact
string password = "$SECRET", which is too narrow. Modify the pattern to use
metavariables and a regex that matches any variable name containing "password"
(case-insensitive) assigned to any string literal, not just "$SECRET". This will
generalize the rule to catch more hardcoded password assignments.