Audio Tracking That Doesn’t Rely on Your Speakerphone
Most dual-lens PTZ cameras use the speakerphone’s audio signal to determine who is speaking and where to point the camera. The CAM570 doesn’t. Its built-in audio sensor array — three pairs of four sensors each, covering ±60° up to 10 metres — tracks the active speaker independently of any external audio device. The critical practical implication: the speakerphone can be positioned away from the camera without degrading tracking accuracy. If the FONE540 or any other speakerphone is at one end of the table and the camera is mounted on the wall, the camera still tracks correctly based on its own sensor, not the speakerphone’s audio feed.
The secondary AI lens runs simultaneously with the PTZ and serves two purposes: real-time motion detection for tracking, and newcomer detection. When someone new enters the room, the AI lens spots them immediately and triggers the PTZ to reframe the group shot to include them — automatically, without operator input. Smart Gallery then crops each detected participant into equal-sized individual feeds using AI face and body detection. Remote attendees see everyone at the same scale. Choose headshot or half-body mode depending on the meeting format.
Smart Composition adds the participant framing capability found in the CAM520 Pro3: AI automatically crops up to 9 participants into perfectly framed individual feeds — headshot or half-body — without any software installation or calibration. It’s a click-to-enable feature in PTZApp 2. Gesture control allows touchless operation of tracking and zoom, keeping touch points to a minimum in shared meeting spaces. Picture-in-picture combines the PTZ speaker close-up and the AI wide-angle room view into a single output with multiple layout options.
The 36X total zoom (12X optical × 3X digital) covers everything from a 90° DFOV wide shot down to tight presenter detail. The camera pans ±170° and tilts +90°/−30°, with 128 presets available via VISCA. PoE+ powers the camera over the network cable. HDMI, USB 3.1, and IP outputs are available. The optional FONE540 speakerphone connects via the Audio In port. The camera carries a three-year warranty; accessories carry one year.
Built-In Audio Sensor — Tracks the Speaker Without the Speakerphone
The CAM570’s built-in sensor array (3 pairs × 4 sensors, ±60°, up to 10 m) identifies and tracks the active speaker independently — without using the connected speakerphone’s audio signal. The speakerphone can be positioned wherever it serves the room best; the camera tracks based on its own sensor regardless. Presentation Mode lets you set a preset point to focus on a specific area, and the camera shifts there automatically when a voice is detected in that zone.
Key Features
- Built-in audio tracking sensor — independent of speakerphone — 3 pairs × 4 sensors; ±60°; up to 10 m range; tracks the active speaker without relying on the speakerphone’s audio signal
- 36X total zoom with dual 4K lenses — 12X optical PTZ at 90° DFOV and secondary AI wide-angle lens run simultaneously; pan ±170°, tilt +90°/-30°; 128 presets
- Dynamic newcomer detection — AI secondary lens detects arrivals in real time; PTZ automatically reframes without operator input
- Smart Gallery — AI face and body detection crops participants into equal headshot or half-body feeds; configured via PTZApp 2
- Smart Composition — up to 9 participants, click to enable — AI frames each participant perfectly; no installation, no calibration; works immediately via PTZApp 2
- Touchless gesture control — toggle tracking and control zoom without physical contact; reduces touch points in shared spaces
- Picture-in-picture output — combines PTZ speaker view and AI wide-angle view in one output; multiple layout options for large room presentations
- PoE+ powered + FONE540 compatible — power over network cable; optional FONE540 speakerphone via Audio In port
Dynamic Detection — The Room View Updates When Someone Walks In
The dual-lens design monitors the room continuously. When the AI secondary lens detects a newcomer entering, it immediately triggers the PTZ to reframe the group shot to include them. The process is real-time and fully automatic — no operator, no controller, no pause in the meeting. For training rooms and boardrooms where late arrivals are common, this keeps the session running without interruption.
Smart Gallery — Everyone Equally Visible, Automatically
Smart Gallery uses AI face and body detection to crop each participant into an equal-sized individual feed. Remote attendees see everyone at the same scale — not a wide shot where the person at the far end of the table is a fraction of the size of the person nearest the camera. Choose headshot or half-body mode and configure it once in PTZApp 2; it runs automatically from that point on.
Gesture Control — Touchless Camera Management Mid-Meeting
The CAM570 recognises hand gestures to toggle tracking on and off and control zoom — no remote, keyboard, or touchscreen required. In shared meeting spaces where reducing touch points matters, or simply when the presenter doesn’t want to break their flow to operate a controller, gesture control removes the friction completely.
Picture-in-Picture — Speaker Close-Up and Room View in One Output
The dual-lens design enables picture-in-picture output: the PTZ speaker close-up and the AI wide-angle room overview are combined into a single video output with multiple layout options. Remote participants get both the detail and the context simultaneously — particularly useful in training sessions and presentations where understanding who is responding and what the room looks like both matter.
Smart Composition — Enterprise Participant Framing, Click to Enable
Smart Composition automatically crops up to 9 participants into individually framed feeds (headshot or half-body) using AI face and body detection. No sensors, no calibration, no software installation required. Click to enable in PTZApp 2 and the CAM570 handles participant framing for every meeting. This is technology usually found only in enterprise-grade multi-camera systems, built into a single PTZ unit.
Specifications
| Category | Specification | What It Means for You |
|---|---|---|
| PTZ Lens | 1/2.8″ 4K Sony Exmor CMOS; 12X optical, 3X digital (36X total); DFOV 90° | 36X total zoom covers mid-to-large rooms from wide shot to tight close-up |
| AI Secondary Lens | 1/2.5″ 4K Sony Exmor CMOS; wide FOV; frame rates match PTZ settings | Wide AI lens provides room overview, motion detection, and newcomer detection simultaneously |
| Frame Rates | 4K/30fps; 1080p/60fps; 720p/60fps; multiple lower resolutions | Smooth 4K and Full HD for all outputs simultaneously |
| Pan / Tilt / Presets | Pan ±170°; Tilt +90° (up) / −30° (down); Presets: 10 via IR, 128 via RS-232 | Full room coverage with 128 precise preset positions |
| Audio Tracking Sensor | Built-in: 3 pairs × 4 sensors; ±60° pickup; up to 10 m range | Tracks speaker independently — speakerphone position doesn’t affect tracking accuracy |
| Smart Composition | AI face/body crop; up to 9 participants; headshot and half-body; click to enable | Equal individual feeds without installation; activate in PTZApp 2 |
| Smart Gallery | AI face and body detection; headshot and half-body modes | Crops participants into equal-sized feeds for remote attendees |
| Dynamic Detection | AI secondary lens detects newcomers; PTZ auto-reframes in real time | Camera keeps the group shot accurate when people arrive late |
| Gesture Control | Touchless: tracking toggle, zoom in/out | Control the camera mid-meeting without touching any device |
| Picture-in-Picture | Multiple layout options combining PTZ and AI lens views | Speaker detail and room context in one output for complex presentations |
| USB | USB 3.1 Type-B; backward compatible with USB 2.0; UVC plug-and-play | Single cable to PC; no drivers; works on Windows, macOS, Chromebook |
| HDMI Out | 1× HDMI | Display camera or PIP output on the room screen |
| Audio In | Audio In × 1 (1 Vrms); connects to the Line-out of the FONE540 speakerphone | Add the FONE540 for full conferencing audio; sensor tracks independently |
| Network | RJ-45; RTSP, RTMP; H.264 | Remote management via PTZApp 2 and WebUI; IP streaming for recording |
| Control | IR remote, VISCA/Pelco-P/Pelco-D via RS-232, VISCA over IP, PTZApp 2, WebUI | Integrates with Crestron, Q-SYS, Extron, and other AV control systems that support these protocols |
| Power | AC 100–240 V, 12 V/2 A; PoE+ (802.3at) | PoE+ eliminates separate power cable for ceiling or high-wall installations |
| Dimensions | 170.8 × 190.5 × 173 mm, 2.1 kg | Compact PTZ; wall bracket and tripod screw included |
| System Requirements | Windows 7/10/11; macOS 10.14+; Chromebox v94+ | Broad OS compatibility; no custom drivers required |
| Compatible Apps | Zoom, Teams, Google Meet, RingCentral, Webex, BlueJeans, and more | Works with all major platforms without additional configuration |
| Optional Accessories | FONE540 speakerphone, ceiling mount, foldable TV mount, USB extender, PoE+ injector | Expand audio and installation options as the room requires |
| Warranty | Camera: 3 years; Accessories: 1 year | AVer hardware warranty; Australian stock via Kickstart Computers |
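
The specifications list VISCA control over RS-232 and IP, with 128 presets reachable via RS-232. As a rough illustration of what a control system sends to recall one of those presets, the sketch below builds the standard VISCA Memory Recall packet (`8x 01 04 3F 02 pp FF`) in Python. The function name, the 0 to 127 preset numbering, and the default camera address of 1 are assumptions made for this example, not taken from AVer's documentation; check the CAM570 VISCA command list before relying on it.

```python
def visca_preset_recall(preset: int, camera_address: int = 1) -> bytes:
    """Build a standard VISCA 'Memory Recall' packet: 8x 01 04 3F 02 pp FF.

    Assumptions (hypothetical, not from the CAM570 manual):
    presets are numbered 0..127 and the camera uses address 1 by default.
    """
    if not 0 <= preset <= 127:
        raise ValueError("preset must be in 0..127 (128 presets)")
    if not 1 <= camera_address <= 7:
        raise ValueError("camera_address must be in 1..7")
    # First byte is 0x80 plus the camera address; 0xFF terminates the packet.
    return bytes([0x80 | camera_address, 0x01, 0x04, 0x3F, 0x02, preset, 0xFF])


packet = visca_preset_recall(5)
print(packet.hex(" "))  # 81 01 04 3f 02 05 ff
```

Writing these bytes to the RS-232 port is left out here. VISCA over IP may wrap the same payload in an additional header depending on the implementation, which is also not shown.
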
Warranty & Support
The AVer CAM570 is covered by a 3-year limited hardware warranty on the camera unit and a 1-year warranty on accessories. Australian stock.
Kickstart Computers provides pre-sales consultation and post-sales support for CAM570 deployments in corporate boardrooms, training facilities, and government environments across Australia.
Explore the full AVer camera and video bar range — conferencing systems, PTZ cameras, and collaboration solutions available at Kickstart Computers for every room size and budget.
Browse our complete video conferencing hub to compare cameras, video bars, and room systems across all major brands, configurations, and price points.