FEAT: dashboard #93

@UranusSeven

Is your feature request related to a problem? Please describe

A dashboard would provide the monitoring and performance metrics needed to manage and optimize the system efficiently.

Describe the solution you'd like

Below are the key features I envision for a dashboard:

  1. Resource Monitoring:
  • CPU: Real-time monitoring of CPU utilization across the nodes of the distributed system.
  • Memory: Tracking and visualization of memory usage for each node.
  • GPU: Monitoring of GPU utilization, allowing us to identify bottlenecks and optimize resource allocation.
  • VRAM: Real-time monitoring of VRAM utilization for GPU-based inference.
  2. Performance Monitoring:
  • Model-Specific Metrics: For each deployed model, capture and display relevant metrics such as the length of the generate task queue, the number of tokens generated per second, and any other model-specific performance indicators.
  • Throughput and Latency: Measure the overall throughput and latency of the system, enabling us to identify performance issues and assess system efficiency.
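
To make the performance metrics above more concrete, here is a minimal sketch of how per-model throughput and latency could be tracked over a sliding time window. It uses only the Python standard library. The class name `ModelMetrics`, its fields, and the 60-second default window are all hypothetical and are not part of any existing API in this project; a real implementation would likely export these values to whatever backend the dashboard ends up using.

```python
import time
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional, Tuple


@dataclass
class ModelMetrics:
    """Hypothetical per-model collector; names and defaults are assumptions."""

    window_seconds: float = 60.0
    queue_length: int = 0  # current length of the generate task queue
    # Each event is (timestamp, tokens generated, request latency in seconds).
    _events: Deque[Tuple[float, int, float]] = field(
        default_factory=deque, init=False, repr=False
    )

    def record(self, tokens: int, latency: float, now: Optional[float] = None) -> None:
        """Record one finished generate request."""
        now = time.monotonic() if now is None else now
        self._events.append((now, tokens, latency))
        self._evict(now)

    def _evict(self, now: float) -> None:
        """Drop events that fall outside the sliding window."""
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] < cutoff:
            self._events.popleft()

    def tokens_per_second(self, now: Optional[float] = None) -> float:
        """Tokens generated inside the window, divided by the window length."""
        self._evict(time.monotonic() if now is None else now)
        return sum(tokens for _, tokens, _ in self._events) / self.window_seconds

    def mean_latency(self, now: Optional[float] = None) -> float:
        """Mean request latency inside the window, or 0.0 if it is empty."""
        self._evict(time.monotonic() if now is None else now)
        if not self._events:
            return 0.0
        return sum(latency for _, _, latency in self._events) / len(self._events)
```

A worker would call `record(tokens, latency)` after each request and update `queue_length` when tasks are enqueued or dequeued; the dashboard could then poll `tokens_per_second()` and `mean_latency()` per model. The `now` parameter exists only so that the behavior can be tested with fixed timestamps.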

Describe alternatives you've considered

Additional context
