Professionally organized notes for SPLK-1002 exam preparation.
==============================
1. Introduction to Splunk
==============================
What is Splunk?
---------------
Splunk is a powerful platform for searching, monitoring, and analyzing machine-generated big data via a web-style interface.
It stores, indexes, and correlates real-time data in a searchable repository from which it can generate graphs, reports,
alerts, dashboards, and visualizations.
Why Use Splunk?
---------------
- Centralized log analysis
- Real-time monitoring
- Powerful dashboards
- Alerting and automation
- Extensible via apps and add-ons
Splunk Components:
------------------
1. **Universal Forwarder (UF)**: Lightweight agent that sends logs to the Splunk indexer.
2. **Indexer**: Parses and indexes the incoming data.
3. **Search Head (SH)**: Frontend used to run searches and build visualizations.
4. **Deployment Server**: Manages configurations for multiple Splunk instances.
Data Flow in Splunk:
--------------------
1. Log sources → UF → Indexer → SH
2. Raw data → Parsing → Indexing → Searching → Reporting
Indexes:
--------
- Logical data storage locations (like folders)
- Default index: `main`
- Custom indexes can be created
Example Log (from secure.log):
------------------------------
`Jun 08 18:20:24 sshd[4747]: Failed password for invalid user john from 10.0.0.4 port 22`
Basic Search:
-------------
index=linux_logs sourcetype=secure.log "Failed password"
Exam Tips:
----------
- Understand each Splunk component and its role.
- Know the data flow and difference between UF, Indexer, and SH.
- Remember where parsing, indexing, and searching occur.
==============================
2. Navigating the Splunk Interface
==============================
Overview:
---------
Splunk's Web Interface (Search Head) is where analysts perform searches, build dashboards, create alerts, and view visualizations.
Main UI Components:
-------------------
1. Search Bar: Where SPL queries are written.
2. Time Range Picker: Choose time windows like "Last 24 hours" or custom time.
3. Sidebar Panel: Displays Datasets, Reports, Alerts, Apps, and Settings.
4. Fields Panel: Shows all indexed and extracted fields for each event.
5. Events Viewer: Displays event logs with field highlighting.
Time Range Picker:
------------------
This is critical to scope your searches correctly.
Search Modes:
-------------
1. Fast: Fastest, skips field discovery.
2. Smart: Default mode, balances speed and field discovery.
3. Verbose: Slower, discovers all fields.
Field Discovery:
----------------
Selected Fields: host, source, sourcetype (the defaults), shown under each event
Interesting Fields: fields that appear in at least 20% of the returned events
Example:
--------
index=linux_logs sourcetype=secure.log "Failed password"
| stats count by user
Exam Tips:
----------
- Know what each UI panel is used for.
- Understand when to use Fast vs. Smart vs. Verbose search modes.
- The Time Picker greatly affects results; always check it before running a search.
==============================
3. Time Ranges in Splunk
==============================
Overview:
---------
Time range selection is one of the most critical aspects of Splunk searches.
Time Picker Presets:
--------------------
- Last 15 minutes
- Last 24 hours
- Last 7 days
- Yesterday
- Real-time
Relative Time:
--------------
- `-1h@h` = 1 hour ago, snapped down to the start of the hour
- `-15m@m` = 15 minutes ago, snapped down to the start of the minute
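The offset-then-snap behaviour of relative time modifiers can be checked outside Splunk with a small Python sketch. The helper name `relative_time` and its parameters are made up for illustration, and it only covers the hour and minute units used above.

```python
from datetime import datetime, timedelta

def relative_time(now, offset_hours=0, offset_minutes=0, snap=""):
    """Subtract the offset first, then snap down to the start of the unit."""
    t = now - timedelta(hours=offset_hours, minutes=offset_minutes)
    if snap == "h":
        t = t.replace(minute=0, second=0, microsecond=0)
    elif snap == "m":
        t = t.replace(second=0, microsecond=0)
    return t

now = datetime(2025, 7, 27, 10, 47, 30)
print(relative_time(now, offset_hours=1, snap="h"))     # like -1h@h  -> 2025-07-27 09:00:00
print(relative_time(now, offset_minutes=15, snap="m"))  # like -15m@m -> 2025-07-27 10:32:00
```

Note that `-1h@h` at 10:47:30 lands on 09:00:00, not 09:47:30, so the window is wider than exactly one hour.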
Time Modifiers in SPL:
----------------------
index=syslog earliest=-2h
index=syslog earliest="07/27/2025:08:00:00" latest="07/27/2025:10:00:00"
Real-Time Searches:
-------------------
- Live dashboarding
- Use with care (high system usage)
Example Query:
--------------
index=linux_logs sourcetype=secure.log "Failed password" earliest=-1h
| stats count by src_ip
| where count > 10
Exam Tips:
----------
- Set the right time range before running queries.
- Know real-time vs. historical tradeoffs.
- Understand time modifiers (`earliest`, `latest`).
==============================
4. SPL Syntax and Search Pipeline
==============================
Overview:
---------
SPL (Search Processing Language) is how you query data in Splunk.
Structure:
----------
Each command is separated by a pipe (`|`) symbol.
Example:
--------
index=web sourcetype=access_combined
| stats count by status
Command Types:
--------------
- Search: `index=main`
- Transforming: `stats`, `chart`, `timechart`
- Filtering: `where`, `fields`, `dedup`
- Eval: `eval`, with functions such as `if()` and `case()`
- Format: `table`, `sort`
Example Query:
--------------
index=web sourcetype=access_combined
| eval is_error=if(status>=400, "yes", "no")
| stats count by is_error
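To see what the `eval` + `stats` pipeline computes, here is a rough Python equivalent. The sample events are invented for illustration; only the `status` field matters.

```python
from collections import Counter

# Hypothetical events standing in for index=web results.
events = [{"status": 200}, {"status": 404}, {"status": 500}, {"status": 301}]

def is_error(event):
    # Mirrors: eval is_error=if(status>=400, "yes", "no")
    return "yes" if event["status"] >= 400 else "no"

# Mirrors: stats count by is_error
counts = Counter(is_error(e) for e in events)
print(dict(counts))
```

Each stage only sees the output of the stage before it, which is why `is_error` must be created by `eval` before `stats` can group on it.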
Real-World Example:
-------------------
index=linux_logs sourcetype=secure.log "Failed password"
| eval day=strftime(_time, "%A")
| stats count by day, user
Exam Tips:
----------
- Command names are not case-sensitive, but field names are; Boolean operators (`AND`, `OR`, `NOT`) must be uppercase.
- Don't forget the `|` between commands.
- Understand the role of each command type in the pipeline.
==============================
5. Using Fields and Field Extraction
==============================
Overview:
---------
Fields are key-value pairs extracted from event data. Splunk automatically extracts some fields and allows manual extractions.
Types of Fields:
----------------
- Default Fields: _time, host, source, sourcetype
- Indexed Fields: Extracted at index time (e.g., host)
- Search-time Fields: Extracted when a search is run (e.g., status)
Field Panels:
-------------
- Selected Fields: Listed under each event in the results (defaults: host, source, sourcetype)
- Interesting Fields: Appear in at least 20% of the current results
Field Extraction Methods:
-------------------------
1. Interactive Extraction: via UI (Settings > Fields > Field Extractions)
2. Using `rex`: Regular expression based extraction
3. Using `spath`: Extract fields from JSON logs
Example using `rex`:
---------------------
index=linux_logs sourcetype=secure.log
| rex "Failed password for (?:invalid user )?(?<user>\w+) from (?<ip>\d+\.\d+\.\d+\.\d+)"
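A quick way to test a `rex` pattern before pasting it into Splunk is to try it in Python against the sample line from secure.log. This is only a sketch: Python writes named groups as `(?P<name>...)` where `rex` uses `(?<name>...)`, and the optional `invalid user ` prefix is handled with a non-capturing group so that the sample line matches.

```python
import re

line = "Jun 08 18:20:24 sshd[4747]: Failed password for invalid user john from 10.0.0.4 port 22"

# Same idea as the rex pattern, in Python's named-group syntax.
pattern = re.compile(
    r"Failed password for (?:invalid user )?(?P<user>\w+) from (?P<ip>\d+\.\d+\.\d+\.\d+)"
)

match = pattern.search(line)
fields = match.groupdict() if match else {}
print(fields)  # {'user': 'john', 'ip': '10.0.0.4'}
```

Without the optional group, `\w+` would have to match `invalid user john`, which contains spaces, so no fields would be extracted from this line.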
Example using `spath` (for JSON):
---------------------------------
index=api sourcetype=json_logs
| spath input=payload path=user.id output=user_id
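What `spath` does with `input`, `path`, and `output` can be pictured with a short Python sketch. The function below is a hypothetical stand-in that handles only dotted object keys (no array indexes), and the sample payload is invented.

```python
import json

def spath(event, input_field, path, output):
    """Parse event[input_field] as JSON and copy the value at the dotted path into event[output]."""
    node = json.loads(event[input_field])
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return event  # path not present: leave the event unchanged
        node = node[key]
    return {**event, output: node}

event = {"payload": '{"user": {"id": 42, "name": "john"}}'}
result = spath(event, "payload", "user.id", "user_id")
print(result["user_id"])  # 42
```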
Best Practices:
---------------
- Use `rex` for unstructured logs
- Use `spath` for JSON or XML
- Avoid extracting the same field multiple times
Exam Tips:
----------
- Understand difference between indexed vs search-time fields.
- Practice both `rex` and `spath` syntax.
- Know where to configure field extractions in the UI.
==============================
6. Using Search Modes
==============================
Overview:
---------
Search Modes determine how much field discovery Splunk performs, which affects speed and detail.
Modes:
------
1. Fast: Minimal field extraction; fastest.
2. Smart: Balanced; default mode.
3. Verbose: Maximum field extraction; slowest.
When to Use:
------------
- Fast: For saved reports, known fields
- Smart: General searching
- Verbose: Exploratory searching
Comparison Table:
-----------------
Mode     | Field Discovery | Speed
---------|-----------------|---------
Fast     | Minimal         | Fast
Smart    | Conditional     | Balanced
Verbose  | Full            | Slow
Exam Tips:
----------
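- Know when to switch modes.
- Verbose returns every field and the raw events, even when the search uses transforming commands.
- Smart adjusts based on pipeline usage: it behaves like Fast when the search contains a transforming command and like Verbose otherwise.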
==============================
7. Transforming Commands
==============================
Overview:
---------
Transforming commands are used to calculate statistics and create charts or time-based trends.
Common Commands:
----------------
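1. stats: Aggregates data into a results table
2. chart: Aggregates like stats, but shapes the output for charting (rows from `over`, columns from `by`)
3. timechart: Time-based trends, with `_time` as the x-axis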
Examples:
---------
index=web sourcetype=access_combined
| stats count by status

index=web sourcetype=access_combined
| chart avg(bytes) over status by host

index=web
| timechart span=1h count by status
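The `timechart span=1h count by status` example boils down to bucketing events by hour and counting per status inside each bucket. A minimal Python sketch, using invented timestamps and statuses:

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical (_time, status) pairs.
events = [
    (datetime(2025, 7, 27, 8, 5), "200"),
    (datetime(2025, 7, 27, 8, 40), "404"),
    (datetime(2025, 7, 27, 8, 59), "200"),
    (datetime(2025, 7, 27, 9, 10), "200"),
]

# One row per 1-hour bucket, one column per status value.
buckets = defaultdict(lambda: defaultdict(int))
for ts, status in events:
    hour = ts.replace(minute=0, second=0, microsecond=0)  # span=1h
    buckets[hour][status] += 1

for hour in sorted(buckets):
    print(hour, dict(buckets[hour]))
```

This is why `timechart` cannot work on results that have lost `_time`: there is nothing left to bucket on.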
Transforming Functions:
-----------------------
- count
- avg
- sum
- min
- max
- dc (distinct count)
- values (list unique)
Best Practices:
---------------
- Use timechart when _time is needed
- Use dc(field) for distinct users/IPs
- Always verify fields exist before using them
Exam Tips:
----------
- Understand difference between stats, chart, and timechart.
- Know transforming functions (avg, dc, sum, etc.).
- Timechart requires _time field.
==============================
8. Data Visualizations & Dashboards
==============================
Overview:
---------
Dashboards visualize search results using charts, tables, and gauges.
Common Visualization Types:
---------------------------
- Column and Bar charts
- Line and Area charts
- Pie and Scatter plots
- Single value, Gauge
Creating Dashboards:
--------------------
- Use "Save As > Dashboard Panel" after running a search.
- Combine multiple panels in one dashboard.
Modifying Panels:
-----------------
- Change chart type, title, color scheme
- Use tokens to pass values between inputs and panels
Best Practices:
---------------
- Use dropdown filters for interactivity
- Title each panel meaningfully
- Don't overload with too many panels
Exam Tips:
----------
- You can save searches as dashboard panels.
- Know the types of visualizations.
- Use dynamic filters and inputs for reusability.
==============================
9. Creating and Using Reports
==============================
Overview:
---------
Reports are saved searches that can be scheduled and shared.
Creating a Report:
------------------
- Run a search
- Click "Save As > Report"
- Set a title, description, permissions
Scheduling:
-----------
- You can schedule reports to run at set intervals
- Set actions like email, PDF export, alert trigger
Managing Reports:
-----------------
- Go to Settings > Searches, reports, and alerts
- Modify permissions, owners, schedule
Difference from Dashboards:
---------------------------
Feature | Report | Dashboard
------------|-------------------------|-------------------------
Purpose | Scheduled results | Interactive view
Output | Table or chart | Multiple visual panels
Scheduling | Yes | No (but can refresh)
Exam Tips:
----------
- Reports are saved searches.
- You can schedule and share reports.
- Reports can send emails or trigger alerts.
==============================
10. Alerts and Scheduled Searches
==============================
Overview:
---------
Alerts are saved searches with conditions that notify you when triggered.
Creating an Alert:
------------------
- Run a search
- Click "Save As > Alert"
- Set trigger condition (number of results, custom logic)
- Choose actions: email, webhook, script
Alert Types:
------------
- Real-time: Triggered as soon as condition met
- Scheduled: Runs at intervals and checks for match
Trigger Conditions:
-------------------
- Per-result (trigger for each event)
- Number of results (e.g. >100 errors)
Actions:
--------
- Send email
- Webhook
- Log to index
- Run script
Best Practices:
---------------
- Avoid real-time unless truly needed
- Use summary indexing for frequent alerts
- Include enough info in alert email
Exam Tips:
----------
- Know difference between real-time vs scheduled.
- Understand how to configure trigger conditions.
- Alerts are saved searches (scheduled or real-time) with trigger conditions and actions.
==============================
11. Event Types and Tags
==============================
Overview:
---------
Event types group similar events under a name, allowing easier reuse.
Creating Event Types:
---------------------
- Search for logs
- Click "Save As > Event Type"
- Provide a name and optional tag
Tags:
-----
- Labels applied to field values or event types
- Help categorize data (e.g., tag IPs as internal/external)
Example:
--------
`tag=authentication` could include event types like `login_success` and `login_failure`
Best Practices:
---------------
- Use consistent naming
- Combine tags with lookups for context
Exam Tips:
----------
- Event types are named search strings that categorize matching events; they cannot contain pipes or subsearches.
- Tags help group events logically.
- Tags are useful for CIM and accelerated datasets.
==============================
12. Lookups and Field Enrichment
==============================
Overview:
---------
Lookups enrich event data by matching fields with external CSV or KV store.
Types of Lookups:
-----------------
1. **File-based (.csv)**
2. **External (scripts)**
3. **KV Store (indexed DB)**
Common Commands:
----------------
- inputlookup: view lookup contents
- lookup: enrich events
- outputlookup: write results
Example:
--------
index=web | lookup ip2location ip AS client_ip OUTPUT city, country
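Conceptually, `lookup ... ip AS client_ip OUTPUT city, country` matches the event's `client_ip` against the `ip` column of the lookup table and copies the named output columns onto the event. A small Python sketch of that idea, where the CSV contents are invented and stand in for whatever table `ip2location` points at:

```python
import csv
import io

# Hypothetical lookup table contents.
lookup_csv = "ip,city,country\n10.0.0.4,Berlin,DE\n10.0.0.9,Lyon,FR\n"
table = {row["ip"]: row for row in csv.DictReader(io.StringIO(lookup_csv))}

def lookup(event):
    # "ip AS client_ip": the table column is ip, the event field is client_ip.
    match = table.get(event.get("client_ip"))
    if match is None:
        return event  # no match: the event gains no new fields
    return {**event, "city": match["city"], "country": match["country"]}

events = [{"client_ip": "10.0.0.4"}, {"client_ip": "192.168.1.1"}]
enriched = [lookup(e) for e in events]
print(enriched)
```

The match here is exact and case-sensitive, which is also the default for CSV lookups.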
Automatic Lookups:
------------------
- Apply based on sourcetype
- Configured under Settings > Lookups > Automatic lookups
Best Practices:
---------------
- Use lookups to map codes, geo info, user info
- Keep lookup file updated
Exam Tips:
----------
- Understand inputlookup vs lookup vs outputlookup.
- Know where automatic lookups are defined.
- Know CSV formatting and matching fields.
==============================
13. Calculated Fields, Aliases, and Field Extractions
==============================
Overview:
---------
Splunk lets you create fields dynamically to simplify searches and improve performance.
Calculated Fields:
------------------
- Use eval expressions to define new fields
- Applied at search-time
Field Aliases:
--------------
- Rename fields without changing underlying data
- Example: alias `clientip` as `ip_address`; both names remain usable in searches
Field Extractions:
------------------
- Use regex or delimiters to define fields
- Created via UI or props.conf
Exam Tips:
----------
- Calculated fields use eval.
- Field aliases map one field name to another.
- Field extractions = making fields from raw logs.
==============================
14. Splunk Knowledge Objects Summary
==============================
Overview:
---------
Knowledge Objects are reusable components that enhance Splunk functionality.
Key Objects:
------------
- Event Types
- Tags
- Lookups
- Reports
- Alerts
- Dashboards
- Data Models
- Field Extractions
- Saved Searches
Management:
-----------
- Settings > Knowledge
- Permissions control sharing (Private, App, Global)
Best Practices:
---------------
- Use naming conventions
- Tag and organize for reuse
Exam Tips:
----------
- Know which object is used where.
- Permissions and ownership impact usage.
- All objects are found in Settings > Knowledge.
==============================
15. Combined Exam Tips (All Sections)
==============================
This section consolidates the most important exam tips scattered across all prior sections and lecture screenshots.
General Exam Tips:
------------------
- Understand the architecture: role of Indexer, Search Head, Universal Forwarder
- Know the difference between real-time, scheduled, and historical searches
- Use Time Picker wisely; avoid querying too much data
- Field names are case-sensitive; command names are not
- Syntax errors often stem from missing `|` or incorrect field references
- Save time by knowing when to use Fast, Smart, or Verbose search modes
- Pay attention to selected vs. interesting fields in the Fields panel
- Practice regex (`rex`) and JSON field extraction (`spath`)
- Use `eval` to create dynamic fields and `stats` for summary views
- `timechart` always needs `_time` field
- Reports vs Dashboards: Reports are for static outputs, Dashboards are for interactive visualization
- Alerts are saved searches with trigger conditions and actions
- Lookup usage is critical: understand `inputlookup`, `lookup`, and `outputlookup`
- Knowledge objects and their permissions (private, app, global) frequently appear in exams
Pro Tips:
----------------------
- Pivot allows visualization without writing SPL; good for business users
- Accelerated Datasets enhance dashboard speed; ideal for scheduled panels
- Use calculated fields instead of rewriting SPL every time
- Don't mix index-time and search-time field logic in same query
- Tags and event types are critical for data model mapping and CIM compliance
- Real-time alerts are costly; prefer scheduled unless justified
- Field aliasing is useful when dealing with multiple sourcetypes
- Use summary indexing to reduce computation for frequent reports/alerts
- Use dropdowns and dynamic filters in dashboards to enhance usability
- Use `dc()` for distinct count and `values()` to list unique items
Suggested Strategy for Exam:
----------------------------
- Memorize SPL syntax and functions: `stats`, `eval`, `dedup`, `chart`, `table`, `sort`, `rename`
- Practice queries using provided sample logs (e.g., `secure.log`)
- Use scenario-based logic: Know what search should be used to troubleshoot login issues or network errors
- Practice building dashboards from raw searches
- Understand the difference between fields, tags, event types, and calculated fields
Recommended Practice:
---------------------
- Write at least 50 SPL queries using transforming + filtering commands
- Create a dashboard with at least 3 panels: timechart, bar, and single-value
- Configure a scheduled alert with condition >10 failed login attempts in 1h
- Perform a lookup join with external CSV data
- Use `rex` to extract usernames from secure.log manually
==============================
16. Syntax Memorization & SPL Restrictions
==============================
This section provides a one-stop reference to memorize SPL (Search Processing Language) syntax and highlights key usage restrictions, caveats, and best practices.
-----------------------------
Case Sensitivity
-----------------------------
- SPL command names: **NOT** case-sensitive (e.g., `stats`, `STATS`, `Stats` all work)
- **Field names**: case-sensitive (`status` ≠ `Status`)
- **String literals** in eval or where: case-sensitive (`"error"` ≠ `"ERROR"`)
-----------------------------
Command Placement & Pipes
-----------------------------
- Each SPL command is separated by a `|` (pipe)
- Commands must follow logical sequence:
  - Search first
  - Filtering / `eval`
  - Transforming / `stats`, `chart`
  - Formatting / `table`, `sort`
Wrong:
| table host | index=main
Correct:
index=main | table host
-----------------------------
`stats` vs `chart` vs `timechart`
-----------------------------
- `stats`: General aggregation (no axis requirements)
index=web | stats count by status
- `chart`: Aggregation shaped for charting; accepts:
  - `over <field>`: rows (x-axis)
  - `by <field>`: columns (data series)
index=web | chart avg(bytes) over status by host
- `timechart`: Requires the `_time` field, which it uses as the x-axis
index=web | timechart span=1h count by status
Restrictions:
-------------
- `chart` takes at most one row-split field (`over`) and one column-split field (`by`); the two can be combined, as in the `chart` example above
- `timechart` **only supports 1 BY field** for splitting series
-----------------------------
Eval & Conditional Logic
-----------------------------
Eval creates or modifies fields dynamically.
Examples:
| eval error=if(status>=400, "yes", "no")
| eval user_type=case(role="admin", "privileged", role="guest", "limited")
Restrictions:
- Inside `eval` and `where`, both `=` and `==` test equality; `!=` tests inequality
- Always enclose string values in `"double quotes"`; an unquoted word is read as a field name
-----------------------------
Filtering Commands
-----------------------------
- `where`: Filters rows based on conditions
- `search`: Can be used inline for match
| where status=404
| search user=admin
Restrictions:
- `where` uses eval-style logic
- `search` uses keyword-based match
-----------------------------
Dedup, Sort, Rename, Table
-----------------------------
- `dedup <field>`: Remove duplicate values by field
- `sort`: Order rows (default limit of 10000 results; `sort 0` removes the limit)
| sort - _time
- `rename <old> AS <new>`: Rename fields
- `table <field1> <field2>`: Output clean column display
-----------------------------
Lookup Syntax
-----------------------------
| lookup ip_lookup ip AS client_ip OUTPUT location
Restrictions:
- Field names must match case exactly
- Lookup file must be defined in `Settings > Lookups`
-----------------------------
Summary
-----------------------------
Command | Purpose | Notes
----------------|--------------------------------|-----------------------------
`eval` | Create fields | Case-sensitive values
`stats` | Aggregate | Multiple fields OK
`chart` | Visual summary | Use `over` / `by` with care
`timechart` | Time-series graph | Needs `_time`
`dedup` | Remove dup rows | One or more fields
`where` | Conditional filter | Uses eval syntax
`search` | Keyword-based filter | Simple text match
`table` | Format as columns | Final display
`sort` | Sort rows | Default limit 10000
`lookup` | Join external data | Case-sensitive
Exam Tips:
----------
- Field names = case-sensitive
- Functions and commands = case-insensitive
- Understand OVER vs BY
- SPL logic: Search → Filter → Eval → Transform → Format
==============================
17. SPL Commands, Purpose, and Usage Reference
==============================
This section provides an organized command reference for all important SPL (Search Processing Language) commands covered in Sections 1-14, including advanced commands like `transaction`.
Each entry includes:
- Purpose
- When to use
- Example
-----------------------------------------
1. `search`
-----------------------------------------
Purpose: Filters raw events based on keywords or field=value.
When to use: Implied at the start of a search that begins with index or keyword terms; can also be used later in the pipeline.
index=linux_logs "Failed password"
-----------------------------------------
2. `stats`
-----------------------------------------
Purpose: Aggregates data (count, avg, sum, etc.)
When to use: Group by fields using `by`
| stats count by status
-----------------------------------------
3. `timechart`
-----------------------------------------
Purpose: Time-based aggregation
When to use: Needs `_time` and optional `by` field
| timechart span=1h count by host
-----------------------------------------
4. `chart`
-----------------------------------------
Purpose: Produces chart-style output
When to use: Use `over` for x-axis and `by` for series
| chart avg(bytes) over status by host
-----------------------------------------
5. `eval`
-----------------------------------------
Purpose: Creates or modifies fields
When to use: Use for conditional logic or transformation
| eval is_error=if(status>=400, "yes", "no")
-----------------------------------------
6. `where`
-----------------------------------------
Purpose: Filters events using eval-style conditions
When to use: Use after stats/eval
| where status=404
-----------------------------------------
7. `table`
-----------------------------------------
Purpose: Formats results into a clean table
When to use: Use at end of search
| table user, ip, status
-----------------------------------------
8. `sort`
-----------------------------------------
Purpose: Orders results
When to use: Use `+` or `-` for ascending/descending
| sort - _time
-----------------------------------------
9. `dedup`
-----------------------------------------
Purpose: Removes duplicate rows by field
When to use: Retains first instance only
| dedup user
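"First instance only" means the first event encountered in result order, which in a normal search is the most recent one. A short Python sketch of the idea, with made-up events; it assumes the default behaviour in which events lacking the field are dropped (`keepempty=false`):

```python
def dedup(events, field):
    """Keep the first event seen for each distinct value of `field`."""
    seen = set()
    kept = []
    for event in events:
        if field not in event:
            continue  # default keepempty=false: events without the field are dropped
        if event[field] not in seen:
            seen.add(event[field])
            kept.append(event)
    return kept

events = [
    {"user": "alice", "action": "logout"},
    {"user": "bob", "action": "login"},
    {"user": "alice", "action": "login"},
]
print(dedup(events, "user"))
```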
-----------------------------------------
10. `fields`
-----------------------------------------
Purpose: Includes or excludes fields
| fields host, source
-----------------------------------------
11. `top` / `rare`
-----------------------------------------
Purpose: Lists most or least common values
| top status
| rare user
-----------------------------------------
12. `rex`
-----------------------------------------
Purpose: Extracts fields using regex
| rex "user=(?<username>\w+)"
-----------------------------------------
13. `spath`
-----------------------------------------
Purpose: Extracts fields from JSON/XML
| spath input=data path=payload.id output=uid
-----------------------------------------
14. `lookup`
-----------------------------------------
Purpose: Joins with external data
| lookup geo_lookup ip AS src_ip OUTPUT city
-----------------------------------------
15. `inputlookup` / `outputlookup`
-----------------------------------------
Purpose: Reads/Writes lookup files
| inputlookup users.csv
| outputlookup filtered_users.csv
-----------------------------------------
16. `transaction`
-----------------------------------------
Purpose: Groups events that belong to the same session or activity
When to use: Tracking multistep processes (e.g., login → logout)
| transaction user startswith="login" endswith="logout"
Note: Groups by user; add `maxspan` or `maxpause` to bound the time window.
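The grouping that `transaction user startswith="login" endswith="logout"` performs can be pictured with a much simplified Python sketch. It assumes events arrive in ascending time order as invented `(time, user, message)` tuples, and it ignores options such as `maxspan` and the handling of transactions that never close.

```python
def transactions(events):
    """Group each user's events from a 'login' message through the next 'logout'."""
    open_tx = {}   # user -> events collected so far
    closed = []
    for time, user, message in events:
        if "login" in message:
            open_tx[user] = [(time, message)]        # startswith="login"
        elif user in open_tx:
            open_tx[user].append((time, message))
            if "logout" in message:                  # endswith="logout"
                group = open_tx.pop(user)
                closed.append({
                    "user": user,
                    "eventcount": len(group),
                    "duration": group[-1][0] - group[0][0],
                })
    return closed

events = [
    (0, "alice", "login ok"),
    (5, "bob", "login ok"),
    (7, "alice", "viewed report"),
    (12, "alice", "logout"),
    (20, "bob", "logout"),
]
print(transactions(events))
```

The `duration` and `eventcount` keys mirror the fields of the same name that `transaction` adds to each grouped event.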
-----------------------------------------
17. `rename`
-----------------------------------------
Purpose: Renames a field
| rename clientip AS ip_address
-----------------------------------------
18. `fillnull`
-----------------------------------------
Purpose: Fills NULL values
| fillnull value="N/A"
-----------------------------------------
19. `join`
-----------------------------------------
Purpose: Joins two datasets on a field
search1 | join user [search2]
-----------------------------------------
20. `append` / `appendcols`
-----------------------------------------
Purpose: Combines multiple searches
search1 | append [search2]
-----------------------------------------
OVER vs BY Summary
-----------------------------------------
| Feature          | `by`                                       | `over`           |
|------------------|--------------------------------------------|------------------|
| Used in          | stats, chart, timechart                    | chart            |
| Axis role        | Grouped rows / data series                 | X-axis rows      |
| Fields allowed   | Multiple in `stats`; 1 in `timechart`      | 1 only           |
| Used together?   | Yes, in `chart`                            | Yes, in `chart`  |
Prepared for certification + real-world analyst usage
Includes: Search commands, Dashboards, Alerts, Knowledge objects