Fast Python Screen Capture for Windows - Updated 2026
import dxcam
with dxcam.create() as camera:
    frame = camera.grab()

Live API Docs: https://ra1nty.github.io/DXcam/
DXcam is a high-performance Python screenshot and capture library for Windows based on the Desktop Duplication API. It is designed for low-latency, high-FPS capture pipelines (including full-screen Direct3D applications).
Compared with common Python alternatives, DXcam focuses on:
- Higher capture throughput (240+ FPS at 1080p)
- Stable capture for full-screen exclusive Direct3D apps
- Better FPS pacing for continuous video capture
- Dual backend support: DXGI and Windows Graphics Capture
- Seamless integration for AI agent and computer vision use cases
Minimal install:
pip install dxcam

Full-feature install (includes OpenCV-based color conversion and WinRT capture backend support):
pip install "dxcam[cv2,winrt]"

Notes:
- Official Windows wheels are built for CPython 3.10 to 3.14.
- Binary wheels include the Cython kernels used by processor backends.
Please refer to CONTRIBUTING.
Contributions are welcome! Development setup and contributor workflow are documented in CONTRIBUTING.md.
Each output (monitor) is associated with one DXCamera instance.
import dxcam
camera = dxcam.create()  # primary output on device 0

To specify backends:
camera = dxcam.create(
    backend="dxgi",  # default Desktop Duplication backend
    processor_backend="cv2",  # default OpenCV processor
)

frame = camera.grab()

grab() returns a numpy.ndarray, or None if no new frame is available since the last capture (kept for backward compatibility). Use camera.grab(new_frame_only=False) to make DXcam always return the latest frame.
Use copy=False (or camera.grab_view()) for a zero-copy view. This is faster, but the returned buffer can be overwritten by later captures.
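The trade-off with zero-copy views can be shown without a display. The sketch below uses a plain bytearray as a stand-in for a capture buffer (an assumption for illustration only; it is not DXcam's internal buffer): a memoryview reflects later writes, while a copy does not.

```python
# Stand-in for a capture buffer that later captures overwrite (hypothetical).
buffer = bytearray(b"\x10\x20\x30\x40")

view = memoryview(buffer)  # zero-copy: shares memory with the buffer
snapshot = bytes(buffer)   # copy: independent of later writes

# Simulate the next capture overwriting the same buffer in place.
for i, value in enumerate(b"\xaa\xbb\xcc\xdd"):
    buffer[i] = value

view_after = bytes(view)   # reflects the new contents
copy_after = snapshot      # still holds the original contents
```

If a frame has to outlive the next capture, take a copy of it (for a NumPy array, frame.copy()).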
To capture a region:
left, top = (1920 - 640) // 2, (1080 - 640) // 2
right, bottom = left + 640, top + 640
frame = camera.grab(region=(left, top, right, bottom))  # numpy.ndarray of shape (640, 640, 3), i.e. (H, W, C)

camera.start(region=(left, top, right, bottom), target_fps=60)
camera.is_capturing # True
# ...
camera.stop()
camera.is_capturing  # False

for _ in range(1000):
    frame = camera.get_latest_frame()  # blocks until a frame is available

The screen capture mode spins up a thread that polls newly rendered frames and stores them in an in-memory ring buffer. The blocking and video_mode behavior is designed for downstream video recording and machine learning workloads.
Useful variants:
- camera.get_latest_frame(with_timestamp=True) -> returns (frame, frame_timestamp)
- camera.get_latest_frame_view() -> zero-copy view into the frame buffer
- camera.grab(copy=False) / camera.grab_view() -> zero-copy latest-frame snapshot

When start() capture is running, calling grab() reads from the in-memory ring buffer instead of directly polling DXGI.
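As a conceptual illustration of a fixed-size ring buffer, the sketch below uses collections.deque with maxlen from the standard library. It only demonstrates the overwrite-when-full behavior; it is not DXcam's buffer implementation, and the integer frame IDs stand in for real frames.

```python
from collections import deque

# Conceptual stand-in for a fixed-size frame buffer (not DXcam's implementation).
ring = deque(maxlen=3)

for frame_id in range(5):  # "capture" five frames into a three-slot buffer
    ring.append(frame_id)  # once full, the oldest entry is dropped

latest = ring[-1]          # most recent frame
buffered = list(ring)      # everything still held, oldest first
```

A reader that always takes the last element sees the most recent frame regardless of how many older frames were dropped.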
release() stops capture, frees buffers, and releases capture resources.
After release(), the same instance cannot be reused.
camera = dxcam.create(output_idx=0, output_color="BGR")
camera.release()
# camera.start()  # raises RuntimeError

Equivalently, you can use a context manager:
with dxcam.create() as camera:
    frame = camera.grab()
# resources released automatically

Full API Docs: https://ra1nty.github.io/DXcam/
cam1 = dxcam.create(device_idx=0, output_idx=0)
cam2 = dxcam.create(device_idx=0, output_idx=1)
cam3 = dxcam.create(device_idx=1, output_idx=1)
img1 = cam1.grab()
img2 = cam2.grab()
img3 = cam3.grab()

Inspect available devices/outputs:
>>> import dxcam
>>> print(dxcam.device_info())
'Device[0]:<Device Name:NVIDIA GeForce RTX 3090 Dedicated VRAM:24348Mb VendorId:4318>\n'
>>> print(dxcam.output_info())
'Device[0] Output[0]: Res:(1920, 1080) Rot:0 Primary:True\nDevice[0] Output[1]: Res:(1920, 1080) Rot:0 Primary:False\n'

Set output color mode when creating the camera:
dxcam.create(output_color="BGRA")

Supported modes: "RGB", "RGBA", "BGR", "BGRA", "GRAY".
Notes:
- Data is returned as numpy.ndarray.
- BGRA does not require OpenCV and is the leanest dependency path.
- RGB, BGR, RGBA, GRAY require conversion (cv2 or compiled numpy backend).
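To make the conversion step concrete, here is a pure-Python sketch of what BGRA to RGB means at the byte level. It is an illustration only, not the cv2 or Cython code path DXcam uses; bgra_to_rgb is a hypothetical helper operating on packed bytes rather than on an ndarray.

```python
def bgra_to_rgb(bgra: bytes) -> bytes:
    """Convert packed BGRA pixels to packed RGB, dropping the alpha channel."""
    if len(bgra) % 4:
        raise ValueError("BGRA data must be a multiple of 4 bytes")
    rgb = bytearray(len(bgra) // 4 * 3)
    rgb[0::3] = bgra[2::4]  # R comes from byte 2 of each BGRA pixel
    rgb[1::3] = bgra[1::4]  # G keeps the middle position
    rgb[2::3] = bgra[0::4]  # B comes from byte 0
    return bytes(rgb)


# Two opaque pixels in BGRA order: pure blue, then pure red.
pixels = bytes([255, 0, 0, 255, 0, 0, 255, 255])
converted = bgra_to_rgb(pixels)
```

Requesting BGRA output skips this kind of channel shuffle entirely, which is why it has the leanest dependency path.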
DXcam uses a fixed-size ring buffer in-memory. New frames overwrite old frames when full.
camera = dxcam.create(max_buffer_len=120)  # default is 8

DXcam uses high-resolution pacing with drift correction to run near target_fps.
camera.start(target_fps=120)  # defaults to 60; values above 120 are resource heavy

On Python 3.11+, DXcam relies on Windows high-resolution timer behavior used by time.sleep().
On older versions, DXcam uses WinAPI waitable timers directly.
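In general, drift correction means scheduling each frame against an absolute deadline instead of sleeping a fixed interval after each frame, so that small overshoots do not accumulate. The following is a generic sketch of that idea under stated assumptions, not DXcam's pacing code: paced_loop, FakeClock, and the injectable clock and sleep parameters are hypothetical names used so the demo runs deterministically.

```python
import time


def paced_loop(target_fps, n_frames, work, clock=time.perf_counter, sleep=time.sleep):
    """Call work() n_frames times, pacing against absolute deadlines."""
    period = 1.0 / target_fps
    start = clock()
    for i in range(n_frames):
        work()
        deadline = start + (i + 1) * period  # absolute, so errors do not add up
        remaining = deadline - clock()
        if remaining > 0:
            sleep(remaining)
    return clock() - start


class FakeClock:
    """Deterministic clock for the demo: sleep() overshoots by 1 ms."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds + 0.001

    def work(self):
        self.now += 0.002  # pretend each frame takes 2 ms to process


fake = FakeClock()
elapsed = paced_loop(100, 10, fake.work, clock=fake.time, sleep=fake.sleep)
```

In this simulation, sleeping a fixed "period minus work time" per frame would let the 1 ms overshoot add up to about 10 ms over ten frames; pacing against absolute deadlines keeps the total error at about 1 ms.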
Read the most recent frame timestamp (seconds):
camera.start(target_fps=60)
frame, ts = camera.get_latest_frame(with_timestamp=True)
camera.stop()

For backend="dxgi", this value comes from DXGI_OUTDUPL_FRAME_INFO.LastPresentTime.
For backend="winrt", this value is derived from WinRT SystemRelativeTime.
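One use for these timestamps is estimating the capture rate actually achieved. The sketch below assumes you have collected the timestamps (in seconds) returned by get_latest_frame(with_timestamp=True) into a list; measured_fps is a hypothetical helper, and the sample values are made up for illustration.

```python
def measured_fps(timestamps):
    """Average frames per second over a sequence of timestamps in seconds."""
    if len(timestamps) < 2:
        raise ValueError("need at least two timestamps")
    span = timestamps[-1] - timestamps[0]
    if span <= 0:
        raise ValueError("timestamps must be increasing")
    return (len(timestamps) - 1) / span


# Hypothetical timestamps spaced 0.25 s apart.
fps = measured_fps([10.0, 10.25, 10.5, 10.75, 11.0])
```

Because the rate is computed from frame timestamps rather than from loop timing, it is not inflated by how quickly the consumer loop itself runs.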
With video_mode=True, DXcam fills the buffer at target FPS, reusing the previous frame if needed, even if no new frame is rendered.
import cv2
import dxcam
target_fps = 30
camera = dxcam.create(output_color="BGR")
camera.start(target_fps=target_fps, video_mode=True)
writer = cv2.VideoWriter(
    "video.mp4", cv2.VideoWriter_fourcc(*"mp4v"), target_fps, (1920, 1080)
)
for _ in range(600):
    writer.write(camera.get_latest_frame())
camera.stop()
writer.release()

DXcam supports two capture backends:
- dxgi (default): Desktop Duplication API path with broad compatibility.
- winrt: Windows Graphics Capture path.
Use it like this:
camera = dxcam.create(backend="dxgi")
camera = dxcam.create(backend="winrt")

Guideline:
- If you need cursor rendering, use winrt.
- Start with dxgi for most workloads, especially one-shot grab.
- Try winrt if it performs better on your machine or fits your app constraints.
DXcam capture backends (dxgi/winrt) first acquire a BGRA frame.
The processor backend then handles post-processing:
- optional rotation/cropping preparation
- color conversion to your output_color
Recommended backend choice:
- OpenCV installed: use cv2 (default)
- No OpenCV installed: use numpy (Cython kernels)
Use it like this:
camera = dxcam.create(processor_backend="cv2")
camera = dxcam.create(processor_backend="numpy")

Official Windows wheels already include the compiled NumPy kernels.
Only for source installs:
set DXCAM_BUILD_CYTHON=1
pip install -e .[cython] --no-build-isolation

If processor_backend="numpy" is selected but compiled kernels are unavailable, DXcam logs a warning and falls back to cv2 behavior. In that fallback path, install OpenCV for non-BGRA output modes.
Using the same logic (capturing only newly rendered frames) on a 240 FPS output, DXcam, python-mss, and D3DShot benchmarked as follows:
| | DXcam | python-mss | D3DShot |
|---|---|---|---|
| Average FPS | 239.19 | 75.87 | 118.36 |
| Std Dev | 1.25 | 0.5447 | 0.3224 |
The benchmark covers 5 runs under light-to-moderate load on my PC (5900X + 3090; Chrome with ~30 tabs, VS Code open, etc.). I used the Blur Busters UFO test to render a constant 240 FPS on my monitor. DXcam captured almost every rendered frame. Some benchmarks online claim 1000+ FPS capture, but most of them busy-spin a for loop on a stale frame (no new frame is rendered on screen in the test scenario).
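The difference between counting loop iterations and counting newly rendered frames can be sketched as follows. This assumes a capture call that returns None when nothing new was rendered, as grab() is described as doing earlier; count_frames and the scripted poll results are hypothetical stand-ins, not the benchmark code used for the table above.

```python
def count_frames(grab, iterations):
    """Return (loop_iterations, new_frames) for a polling loop."""
    new_frames = 0
    for _ in range(iterations):
        if grab() is not None:  # None means nothing new was rendered
            new_frames += 1
    return iterations, new_frames


# Stand-in for a capture call: only every fourth poll yields a new frame.
polls = iter([None, None, None, "frame"] * 5)
loops, fresh = count_frames(lambda: next(polls), 20)
```

Dividing the iteration count by elapsed time would overstate the capture rate by a factor of four in this scripted scenario; only the new-frame count reflects what was actually captured.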
| Target FPS (mean, std) | DXcam | python-mss | D3DShot |
|---|---|---|---|
| 60fps | 61.71, 0.26 | N/A | 47.11, 1.33 |
| 30fps | 30.08, 0.02 | N/A | 21.24, 0.17 |
OBS Studio - implementation ideas and references.
D3DShot - DXcam borrowed some ctypes headers from the no-longer-maintained D3DShot.