sjsj0/dlq

Distributed Log Query (DLQ-G33)

Overview

DLQ-G33 is a distributed system for querying log files across multiple virtual machines (VMs) using Go and RPC. It enables users to run grep-like queries on logs stored on each VM, aggregate results, and validate outputs against ground truth logs.

Features

  • Distributed Grep: Each VM runs an RPC server to execute grep commands on its local log file.
  • Aggregation: A client sends queries to all VMs, collects results, and writes outputs to timestamped files.
  • Ground Truth Comparison: Unit tests compare distributed query outputs against precomputed logs for frequent, infrequent, and rare patterns.

Directory Structure

src/                 # Source code
  ground_truth/      # Ground truth logs for validation (frequent/infrequent/rare)
  main/
    main.go          # Main distributed grep program
    compare_test.go  # Unit tests for output validation
  output/            # Timestamped folders with query results
  logs/              # Raw log files for each VM

setup/               # Scripts for VM setup and orchestration

How to Run

1. VM Setup

On each VM, run the setup script to install dependencies and clone the repo:

cd setup
bash setup.bash

Or, to automate setup and startup across all VMs from one machine:

bash vm_setup.bash setup   # Setup all VMs
bash vm_setup.bash start   # Start servers on all VMs

2. Start the Distributed Grep Program

If you ran the automation scripts, an RPC server and REPL are already running on each VM inside a tmux session. Attach to the session with:

sudo tmux -S /tmp/tmux-cs-425.sock attach -t cs-425-shared

Or run directly:

cd src/main
go run main.go

3. Query Logs

Run grep commands from the REPL prompt, for example:

>> grep -i "PUT"
>> grep -i "griffin"
>> grep -i "156.98.57.6"
  • Results are saved in src/output/<timestamp>/, with a separate timestamped folder for each query.
  • To navigate to the output folder, run: cd ~/../mp1/dlq-g33/src/output/
  • Each timestamped folder contains 10 files, one per VM, each holding that VM's output.

Unit Testing

Compare Output with Ground Truth

Run unit tests to validate the latest output against ground truth logs:

cd src/main
GO111MODULE=off go test -v -args -category=frequent
GO111MODULE=off go test -v -args -category=infrequent
GO111MODULE=off go test -v -args -category=rare
  • The -category flag selects which ground truth set to use: frequent, infrequent, or rare.
  • For each VM, the tests compare the line count and content of the grep output against the ground truth.

Patterns

  • Frequent: grep -i "PUT"

  • Infrequent: grep -i "griffin"

  • Rare: grep -i "156.98.57.6"
