Good bot support is part of what makes Discord so nice to use. Unfortunately, the official Zoom API is basically only useful for scheduling meetings and using Zoom Chat, not for making in-meeting bots. So I decided to bring this feature to Zoom myself.
The demo is a basic bot of about 130 lines of code. Current feature status:
- Audio
  - ✅ Listening WebSocket
  - ✅ RTP decoding
  - ❌ Audio packet decoding (unable to mux/format the audio packets into a readable format)
  - ✅ Publishing WebSocket
- Video
  - ✅ Viewing WebSocket
  - ✅ RTP decoding
  - ✅ H264 decoding tested
  - ✅ Recording of a single participant (❌ with multiple participants, all video streams are currently dumped into a single file, which breaks the recording)
  - ❌ RTP encoding (see `ZoomRtpEncoder` in `zoom/rtp.go`)
  - ✅ Publishing WebSocket
- Screenshare
  - ✅ Viewing WebSocket
  - ✅ RTP decoding (❌ RTP extension frame info not completely figured out, see `zoom/protocol/rtp_ext_frame_info.go`)
  - ❌ H264 decoding (seems like a custom H264 codec, unplayable by ffmpeg)
  - ❌ H264 encoding (need to solve decoding first)
  - ❌ RTP encoding (see `ZoomRtpEncoder` in `zoom/rtp.go`)
  - ✅ Publishing WebSocket
I created this by reverse engineering the Zoom Web SDK. Regular web joins are captcha-gated but Web SDK joins are not. I use an API only used by the Web SDK to get the tokens needed to join the meeting. This means you need a Zoom API key/secret, specifically a "Meeting SDK" one. These can be obtained on the Zoom App Marketplace site: click Meeting SDK (Create) -> name app, disable publishing to marketplace -> fill descriptions and contact information with anything you want -> click App Credentials. The demos in `examples/` read these from the environment as `ZOOM_API_KEY` and `ZOOM_API_SECRET`.
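As a rough illustration of the environment-variable convention described above, a bot could load and validate the two values before trying to join. This is only a sketch: the `Credentials` type and the `loadCredentials` helper are hypothetical names made up for this example and are not part of zoomer's API.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Credentials holds the Meeting SDK key/secret pair.
// Hypothetical type for illustration; not taken from the zoomer code base.
type Credentials struct {
	Key    string
	Secret string
}

// loadCredentials reads ZOOM_API_KEY and ZOOM_API_SECRET through the supplied
// lookup function (os.Getenv in normal use) and rejects empty values.
func loadCredentials(getenv func(string) string) (Credentials, error) {
	c := Credentials{
		Key:    getenv("ZOOM_API_KEY"),
		Secret: getenv("ZOOM_API_SECRET"),
	}
	if c.Key == "" || c.Secret == "" {
		return Credentials{}, errors.New("ZOOM_API_KEY and ZOOM_API_SECRET must both be set")
	}
	return c, nil
}

func main() {
	creds, err := loadCredentials(os.Getenv)
	if err != nil {
		fmt.Println("cannot start:", err)
		return
	}
	// Never print the secret itself; the key length is enough for a sanity check.
	fmt.Printf("loaded API key of length %d\n", len(creds.Key))
}
```

Passing the lookup function in as a parameter keeps the helper easy to test without touching the real process environment.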
Because the API keys are associated with your account, using this software may get your Zoom account banned (reverse engineering is against the Zoom Terms of Service). Please do not use this on an important account.
go get github.com/RealKeyboardWarrior/zoomer
cd $GOPATH/src/github.com/RealKeyboardWarrior/zoomer
scripts/build.sh
SDK Authentication
ZOOM_API_KEY="xxx" ZOOM_API_SECRET="xxx" ./zoomer -meetingNumber xxxxx -password xxxxx
JWT Authentication (deprecated)
ZOOM_API_TYPE="jwt" ZOOM_API_KEY="xxx" ZOOM_API_SECRET="xxx" ./zoomer -meetingNumber xxxxx -password xxxxx
Feel free to use the demo as a template. If you want to use the library elsewhere, just import `github.com/RealKeyboardWarrior/zoomer/pkg/zoom`.
See the comments in main.go
Feature | Send/recv | Message Name | Function (if send) / struct type (if recv) | Host Required | Tested |
---|---|---|---|---|---|
Send a chat message | Send | WS_CONF_CHAT_REQ | ZoomSession.SendChatMessage | No | Yes |
Pretend to "join audio" | Send | WS_AUDIO_VOIP_JOIN_CHANNEL_REQ | ZoomSession.JoinAudioVoipChannel | No | Yes |
Pretend to turn on/off video (if enabled, the camera indicator appears to be on but just shows a black screen) | Send | WS_VIDEO_MUTE_VIDEO_REQ | ZoomSession.SetVideoMuted | No | Yes |
Pretend to screen share (shows "x is sharing their screen" but the share is just a black screen) | Send | WS_CONF_SET_SHARE_STATUS_REQ | ZoomSession.SetScreenShareMuted | Depending on share settings | Yes |
Pretend to turn on/off audio (if enabled, the audio indicator appears to be on but no audio is actually output) | Send | WS_AUDIO_MUTE_REQ | ZoomSession.SetAudioMuted | No | Yes |
Rename self | Send | WS_CONF_RENAME_REQ | ZoomSession.RenameMe | Depending on settings | Yes |
Rename others | Send | WS_CONF_RENAME_REQ | ZoomSession.RenameById | Yes | No |
Request everyone mutes themselves | Send | WS_AUDIO_MUTEALL_REQ | ZoomSession.RequestAllMute | Yes | No |
Set mute upon entry status | Send | WS_CONF_SET_MUTE_UPON_ENTRY_REQ | ZoomSession.SetMuteUponEntry | Yes | No |
Set allow unmuting audio | Send | WS_CONF_ALLOW_UNMUTE_AUDIO_REQ | ZoomSession.SetAllowUnmuteAudio | Yes | No |
Set allow participant renaming | Send | WS_CONF_ALLOW_PARTICIPANT_RENAME_REQ | ZoomSession.SetAllowParticipantRename | Yes | No |
Set chat restrictions level | Send | WS_CONF_CHAT_PRIVILEDGE_REQ | ZoomSession.SetChatLevel | Yes | Yes |
Set screen sharing locked status | Send | WS_CONF_LOCK_SHARE_REQ | ZoomSession.SetShareLockedStatus | Yes | No |
End meeting | Send | WS_CONF_END_REQ | ZoomSession.EndMeeting | Yes | No |
Set allow unmuting video | Send | WS_CONF_ALLOW_UNMUTE_VIDEO_REQ | ZoomSession.SetAllowUnmuteVideo | Yes | No |
Request breakout room join token | Send | WS_CONF_BO_JOIN_REQ | ZoomSession.RequestBreakoutRoomJoinToken | No | Yes |
Breakout room broadcast | Send | WS_CONF_BO_BROADCAST_REQ | ZoomSession.BreakoutRoomBroadcast | Yes | No |
Request a token for creation of a breakout room | Send | WS_CONF_BO_TOKEN_BATCH_REQ | ZoomSession.RequestBreakoutRoomToken | Yes | Yes |
Create a breakout room | Send | WS_CONF_BO_START_REQ | ZoomSession.CreateBreakoutRoom | Yes | No |
Join information (user ID, participant ID and some other stuff) | Recv | WS_CONF_JOIN_RES | JoinConferenceResponse | Yes | |
Breakout room creation token response (response to WS_CONF_BO_TOKEN_BATCH_REQ) | Recv | WS_CONF_BO_TOKEN_RES | ConferenceBreakoutRoomTokenResponse | Yes | |
Breakout room join response | Recv | WS_CONF_BO_JOIN_RES | ConferenceBreakoutRoomJoinResponse | Yes | |
Permission to show avatars changed | Recv | WS_CONF_AVATAR_PERMISSION_CHANGED | ConferenceAvatarPermissionChanged | Yes | |
Roster change (mute/unmute, renames, leaves/joins) | Recv | WS_CONF_ROSTER_INDICATION | ConferenceRosterIndication | Yes | |
Meeting attribute setting (stuff like "is sharing allowed" or "is the meeting locked") | Recv | WS_CONF_ATTRIBUTE_INDICATION | ConferenceAttributeIndication | Yes | |
Host change | Recv | WS_CONF_HOST_CHANGE_INDICATION | ConferenceHostChangeIndication | Yes | |
Cohost change | Recv | WS_CONF_COHOST_CHANGE_INDICATION | ConferenceCohostChangeIndication | Yes | |
"Hold" state (waiting rooms) | Recv | WS_CONF_HOLD_CHANGE_INDICATION | ConferenceHoldChangeIndication | Yes | |
Chat message | Recv | WS_CONF_CHAT_INDICATION | ConferenceChatIndication | Yes | |
Meeting "option" parameter (used for waiting room and breakout rooms) | Recv | WS_CONF_OPTION_INDICATION | ConferenceOptionIndication | Yes | |
??? Local Record Indication ??? | Recv | WS_CONF_LOCAL_RECORD_INDICATION | ConferenceLocalRecordIndication | Yes | |
Breakout room command (forcing you to join a room, broadcasts) | Recv | WS_CONF_BO_COMMAND_INDICATION | ConferenceBreakoutRoomCommandIndication | Yes | |
Breakout room attributes (settings and list of rooms) | Recv | WS_CONF_BO_ATTRIBUTE_INDICATION | ConferenceBreakoutRoomAttributeIndication | Yes | |
Datacenter Region | Recv | WS_CONF_DC_REGION_INDICATION | ConferenceDCRegionIndication | Yes | |
??? Audio Asn ??? | Recv | WS_AUDIO_ASN_INDICATION | AudioAsnIndication | Yes | |
??? Audio Ssrc ??? | Recv | WS_AUDIO_SSRC_INDICATION | AudioSSRCIndication | Yes | |
Someone has enabled video | Recv | WS_VIDEO_ACTIVE_INDICATION | VideoActiveIndication | Yes | |
??? Video Ssrc ??? | Recv | WS_VIDEO_SSRC_INDICATION | SSRCIndication | Yes | |
Someone is sharing their screen | Recv | WS_SHARING_STATUS_INDICATION | SharingStatusIndication | Yes |
Note that you are free to construct your own message types for any I have not implemented.
For sending: Look at `zoom/requests.go` and switch out the struct and message type names for your new message type.
For receiving: Create a definition for the type and update the `getPointerForBody` function in `zoom/message.go`.
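To make the receiving side more concrete, here is a minimal sketch of the general pattern: decode the outer JSON object first, pick an empty struct based on the event number, then decode the body into it. It is not copied from `zoom/message.go`. The `envelope` type, the `pointerForBody` and `decode` helpers, and the struct fields are assumptions for illustration; the fields are inferred from the data-center region message quoted later in this document.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event number for the data-center region message, as listed in this README.
const WS_CONF_DC_REGION_INDICATION = 7954

// envelope mirrors the outer JSON object. Body stays raw until Evt is known.
// Hypothetical type; the real library may structure this differently.
type envelope struct {
	Body json.RawMessage `json:"body"`
	Evt  int             `json:"evt"`
	Seq  int             `json:"seq"`
}

// ConferenceDCRegionIndication uses field names inferred from a sample
// message; treat them as an assumption, not the library's definition.
type ConferenceDCRegionIndication struct {
	DC      string `json:"dc"`
	Network string `json:"network"`
	Region  string `json:"region"`
}

// pointerForBody returns a pointer to an empty struct for a known event
// number, or nil for an unknown one. It is a stand-in for the role that
// getPointerForBody plays; adding a new message type means adding a case.
func pointerForBody(evt int) interface{} {
	switch evt {
	case WS_CONF_DC_REGION_INDICATION:
		return &ConferenceDCRegionIndication{}
	}
	return nil
}

// decode parses the envelope, then unmarshals the body into the matching struct.
func decode(raw []byte) (interface{}, error) {
	var env envelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return nil, err
	}
	target := pointerForBody(env.Evt)
	if target == nil {
		return nil, fmt.Errorf("no struct registered for evt %d", env.Evt)
	}
	if err := json.Unmarshal(env.Body, target); err != nil {
		return nil, err
	}
	return target, nil
}

func main() {
	msg := []byte(`{"body":{"dc":"the United States(SC)","network":"Zoom Global Network","region":"the United States"},"evt":7954,"seq":3}`)
	v, err := decode(msg)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%+v\n", v)
}
```

Keeping the body as `json.RawMessage` means the payload is parsed only once the target type is known, so unknown events can be logged or skipped without failing the whole read loop.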
The protocol used by the Zoom Web client is basically just JSON over WebSockets. The messages look something like this:
{"body":{"bCanUnmuteVideo":true},"evt":7938,"seq":44}
{"body":{"add":null,"remove":null,"update":[{"audio":"","bAudioUnencrytped":false,"id":16785408}]},"evt":7937,"seq":47}
{"body":{"add":null,"remove":null,"update":[{"caps":5,"id":16785408,"muted":true}]},"evt":7937,"seq":63}
{"body":{"dc":"the United States(SC)","network":"Zoom Global Network","region":"the United States"},"evt":7954,"seq":3}
The "evt" field is the event number. There is a (mostly complete) list of these in `zoom/constant.go` that I extracted from the JavaScript code on the meeting page.
The four messages above use three event types:
WS_CONF_ATTRIBUTE_INDICATION = 7938 // ConferenceAttributeIndication
WS_CONF_ROSTER_INDICATION = 7937 // ConferenceRosterIndication
WS_CONF_DC_REGION_INDICATION = 7954 // ConferenceDCRegionIndication
The comment to the right of each constant is the struct type for that message, which can be found in `zoom/message_types.go`.
Also, the server and client both attach sequence numbers ("seq") to the messages they send, but these don't appear to be used for anything (?).
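A quick way to get a feel for the traffic is to read only "evt" and "seq" from each message and look up the event name. The sketch below does that for the four sample messages above. The `eventNames` map and the `describe` helper are hypothetical and contain only the three constants quoted in this section, not the full list from `zoom/constant.go`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// eventNames covers only the three event numbers quoted in this README.
var eventNames = map[int]string{
	7938: "WS_CONF_ATTRIBUTE_INDICATION",
	7937: "WS_CONF_ROSTER_INDICATION",
	7954: "WS_CONF_DC_REGION_INDICATION",
}

// header picks out the two routing fields and ignores the body entirely.
type header struct {
	Evt int `json:"evt"`
	Seq int `json:"seq"`
}

// describe returns the event name (or "unknown(N)") and the sequence number.
func describe(raw []byte) (string, int, error) {
	var h header
	if err := json.Unmarshal(raw, &h); err != nil {
		return "", 0, err
	}
	name, ok := eventNames[h.Evt]
	if !ok {
		name = fmt.Sprintf("unknown(%d)", h.Evt)
	}
	return name, h.Seq, nil
}

func main() {
	samples := []string{
		`{"body":{"bCanUnmuteVideo":true},"evt":7938,"seq":44}`,
		`{"body":{"add":null,"remove":null,"update":[{"audio":"","bAudioUnencrytped":false,"id":16785408}]},"evt":7937,"seq":47}`,
		`{"body":{"add":null,"remove":null,"update":[{"caps":5,"id":16785408,"muted":true}]},"evt":7937,"seq":63}`,
		`{"body":{"dc":"the United States(SC)","network":"Zoom Global Network","region":"the United States"},"evt":7954,"seq":3}`,
	}
	for _, s := range samples {
		name, seq, err := describe([]byte(s))
		if err != nil {
			fmt.Println("bad message:", err)
			continue
		}
		fmt.Printf("seq=%d evt=%s\n", seq, name)
	}
}
```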
- Gracefully exit/disconnect
- Organize `zoom/message_types.go` and general refactoring
- Support for meetings where you don't have the password but just a Zoom URL with the "pwd" parameter in it (anyone know anything about this??)
- Thoroughly test things
- Make it more extensible
- Joining breakout room support
- More comments and documentation
- Support audio/video (partial support for decrypting screen share has been added)
This is hobbyist software that has no guarantees of being maintained or supported. Please don't use it anywhere near production.