Camera and Speakerphone — Matched and Ready as One System
The VC520 Pro3 ships as a complete unit: a 4K PTZ camera and a matched speakerphone designed to work together from the moment they’re unboxed. For IT managers equipping mid-to-large conference rooms, the matched-system approach means audio and video are tuned as a pair, setup takes minutes, and the single USB cable connection keeps the meeting space clean and uncluttered.
The camera brings the same Smart Composition capability found on the standalone CAM520 Pro3: AI that crops up to 9 participants into equal-sized individual feeds automatically, with no sensors, calibration, or installation required. The 36X total zoom (12X optical, 24X lossless) ranges from an 80.5° diagonal field of view at its widest down to tight presenter detail. WDR processing keeps the image balanced in backlit and high-contrast environments. SmartFrame AI auto-framing adjusts continuously to keep all participants in shot, including late arrivals.
The speakerphone is where the system earns its price over the standalone camera. It connects audio and video to the PC through a single USB cable — no separate audio interface, no driver installation, no multi-cable management. For rooms that grow, the speakerphone supports daisy-chain expansion: add up to two additional expansion speakerphones or a full-duplex microphone set to extend coverage across larger or irregular room layouts without any additional processing hardware or mixer.
HDMI, USB, and IP outputs all run simultaneously at 1080p/60fps, the same triple-stream capability as the CAM520 Pro3. The camera accepts VISCA, Pelco-P, and Pelco-D commands over RS-232, plus VISCA over IP, for integration with room control systems. It is certified for Zoom, Teams, and Google Meet, and both the camera and the speakerphone carry a three-year warranty.
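For integrators wiring the camera into a room control system, a preset recall over VISCA might look like the sketch below. It assumes the camera follows the standard Sony VISCA CAM_Memory Recall byte layout (81 01 04 3F 02 pp FF) and that the 128 presets listed in the specifications are numbered from zero. The function name and range checks are illustrative only, so confirm the exact command set against AVer's VISCA documentation before relying on it.

```python
def visca_recall_preset(preset: int, camera_address: int = 1) -> bytes:
    """Build a VISCA CAM_Memory Recall packet (assumed Sony-standard layout)."""
    # 128 presets per the spec sheet; zero-based numbering is an assumption.
    if not 0 <= preset <= 127:
        raise ValueError("preset must be in 0..127")
    # Standard VISCA addresses individual cameras as 1..7.
    if not 1 <= camera_address <= 7:
        raise ValueError("camera_address must be in 1..7")
    header = 0x80 | camera_address
    return bytes([header, 0x01, 0x04, 0x3F, 0x02, preset, 0xFF])


packet = visca_recall_preset(5)
print(packet.hex(" "))  # prints "81 01 04 3f 02 05 ff"
```

The resulting bytes would be written to the RS-232 port or wrapped in whatever transport the camera expects for VISCA over IP. Serial settings and IP framing are not given in this listing, so they are left out here.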
36X Total Zoom with WDR — Sharp and Balanced in Any Room

The VC520 Pro3’s 12X optical zoom extends to 24X lossless zoom and to 36X total via PTZApp 2, ranging from an 80.5° diagonal FOV at the wide end down to tight presenter detail. WDR processing keeps the image balanced in backlit rooms, so participants in front of windows remain clearly visible without manual exposure adjustment.
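To get a feel for what these zoom figures mean in a room, the sketch below estimates the horizontal field of view at each zoom level, starting from the 72.8° wide-end figure in the specifications. It assumes an ideal rectilinear lens (real lenses deviate slightly), and the function name is illustrative. It also shows one plausible reading of the 24X lossless figure: cropping a 3840-pixel-wide 4K sensor to a 1920-pixel-wide 1080p output gives a 2X crop with no upscaling, which takes the 12X optical zoom to 24X. That reading is an assumption, not AVer's stated method.

```python
import math


def fov_at_zoom(wide_fov_deg: float, zoom: float) -> float:
    """Approximate field of view at a zoom factor (ideal rectilinear lens)."""
    half_wide = math.radians(wide_fov_deg / 2)
    return math.degrees(2 * math.atan(math.tan(half_wide) / zoom))


WIDE_HFOV_DEG = 72.8  # horizontal FOV from the spec table
OPTICAL_ZOOM = 12

# Assumed: lossless zoom = optical zoom times (sensor width / output width).
lossless_zoom = OPTICAL_ZOOM * (3840 // 1920)

for zoom in (1, OPTICAL_ZOOM, lossless_zoom, 36):
    fov = fov_at_zoom(WIDE_HFOV_DEG, zoom)
    print(f"{zoom:>2}X: about {fov:.1f} degrees horizontal")
```

Under these assumptions the horizontal view narrows from 72.8° at the wide end to roughly 7° at full optical zoom, which is the range that lets one camera cover both the whole table and a single presenter.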
Key Features
- Smart Composition — up to 9 participants, no setup — AI crops each participant into equal headshot or half-body feeds automatically; no software installation, calibration, or sensors required
- 36X total zoom with 24X lossless zoom — 12X optical; 24X lossless for outstanding clarity; 80.5° diagonal FOV for mid-to-large rooms
- Triple simultaneous output at 1080p/60fps — HDMI, USB 3.1, and IP all active at once; display, video call, and recording running simultaneously
- Single USB cable — video and audio to the PC — one USB connection handles both; no separate audio interface, no extra cables, no driver installation
- Speakerphone with daisy-chain expansion — add up to two expansion speakerphones or a full-duplex microphone set to extend audio coverage without additional hardware
- WDR Sony sensor — wide dynamic range keeps participants clearly visible in backlit or high-contrast environments
- VISCA/Pelco control via RS-232 and IP — integrates with Crestron, Q-SYS, Extron; 128 preset positions
- 3-year warranty on camera and speakerphone — both units covered; Australian stock through Kickstart Computers
Smart Composition — Enterprise Participant Framing, No Installation

Smart Composition uses AI to detect and crop up to 9 participants into equal-sized individual feeds, in headshot or half-body mode, automatically. No sensors, no calibration, no software to install. It is the kind of gallery framing usually found only in enterprise-grade multi-camera systems, built directly into a single PTZ camera and speakerphone package.
Triple Simultaneous Output — Display, Call, and Recording at Once

HDMI, USB 3.1, and IP outputs all run simultaneously at 1080p/60fps — one VC520 Pro3 feeds the room display, the conferencing PC, and a recording or streaming encoder at the same time, with no switching hardware and no quality compromise across any of the three outputs.
One USB Cable — Video and Audio to the PC, Nothing Else

Plug a single USB cable into the PC and the VC520 Pro3 is ready — video from the camera and audio from the speakerphone both run through that one connection. No separate audio interface, no driver installation, no multi-cable management behind the display. Setup takes minutes and the result is a clean, uncluttered meeting space.
Expandable Audio — Daisy-Chain Coverage Without Extra Hardware

The included speakerphone expands via daisy-chain — connect up to two additional expansion speakerphones or a full-duplex microphone set to extend audio coverage across larger or irregularly shaped rooms. No additional processing hardware, no mixer, no extra configuration required. The system scales with the room.
Specifications
| Category | Specification | What It Means for You |
|---|---|---|
| Image Sensor | 1/2.8″ 4K Sony Exmor CMOS | Sony WDR sensor for accurate colour and backlight compensation |
| Resolution | 1080p up to 60fps; 720p up to 60fps; multiple lower resolutions at 30/15fps | Smooth Full HD across all three simultaneous outputs |
| Zoom | 12X optical; 24X lossless; 36X total (via PTZApp 2) | 24X lossless zoom for maximum clarity without digital degradation |
| Field of View | 80.5°/72.8°/44.1° (D/H/V) | Wide enough for mid-to-large rooms; narrows to tight presenter detail |
| Smart Composition | AI face/body crop; up to 9 participants; headshot and half-body modes | Equal individual feeds — click to enable, no sensors or calibration needed |
| SmartFrame | AI auto-framing; adjusts continuously for late arrivals | Keeps all participants in frame throughout the meeting automatically |
| Triple Output | HDMI, USB 3.1, IP — all simultaneous at 1080p/60fps | Display, video call, and recording all running at the same time |
| Single USB Setup | One USB cable carries video and audio to the PC | No separate audio interface; fast setup; clean meeting space |
| Speakerphone Expansion | Daisy-chain: up to 2× expansion speakerphones or full-duplex microphone set | Scale audio coverage for larger rooms without additional processing hardware |
| Audio Format | AAC-LC (carried over RTSP and RTMP streams) | Standard audio codec; compatible with all major conferencing platforms |
| Video Format | YUV, YUY2, MJPEG; H.264, H.265; RTSP, RTMP | Compatible with all major conferencing and recording platforms |
| Control | IR remote, VISCA/Pelco-P/Pelco-D via RS-232, VISCA over IP, PTZApp 2, WebUI | 128 presets; integrates with Crestron, Q-SYS, Extron |
| Power | AC 100–240 V, 12 V/5 A | Standard power adapter; suitable for wall, shelf, or cart installation |
| Camera Dimensions | 182 × 142.7 × 153 mm, 1.47 kg | Compact PTZ; compatible with standard wall, ceiling, and TV mounts |
| Speakerphone Dimensions | 220 × 181.5 × 49.5 mm, 0.85 kg | Low-profile; sits centrally on the conference table |
| System Requirements | Windows 7/10/11; macOS 10.14+; Chromebox v94+ | Broad OS compatibility; UVC/UAC plug-and-play |
| Compatible Apps | Zoom, Teams, Google Meet, RingCentral, Webex, BlueJeans, and more | Certified for Zoom, Teams, and Google Meet |
| Warranty | Camera and speakerphone: 3 years; accessories: 1 year | Both units covered by AVer’s hardware warranty; Australian stock |
Warranty & Support
The AVer VC520 Pro3 camera and speakerphone are each covered by a 3-year limited hardware warranty; accessories are covered for 1 year. Units are supplied from Australian stock.
Kickstart Computers provides pre-sales consultation and post-sales support for VC520 Pro3 deployments in corporate boardrooms, training rooms, and government environments across Australia.
Explore the full AVer camera and video bar range — conferencing systems, PTZ cameras, and collaboration solutions available at Kickstart Computers for every room size and budget.
Browse our complete video conferencing hub to compare cameras, video bars, and room systems across all major brands, configurations, and price points.