An overview of the PPPP protocol for IoT cameras
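The article analyzes the design and implementation of the PPPP protocol, which is used for IoT device communication and is particularly widespread in network cameras. Although advertised as a P2P protocol, it actually relies on central servers, and it supports multiple network ports and device ID structures. The article also discusses the different protocol variants (such as CS2, Yi, iLnk) and their differences, and reveals the encryption mechanisms and real-world usage. 2025-11-5 15:20:59 Author: palant.info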

My previous article on IoT “P2P” cameras couldn’t go into much detail on the PPPP protocol. However, there is already lots of security research on and around that protocol, and I have a feeling that there is way more to come. There are pieces of information on the protocol scattered throughout the web, yet each one approaches it from a very specific, narrow angle. This is my attempt at creating an overview so that other people don’t need to start from scratch.

While the protocol can in principle be used by any kind of device, so far I’ve only seen it used by network-connected cameras. It isn’t really peer-to-peer as advertised but rather relies on central servers, yet the protocol allows the bulk of the data to be transferred via a direct connection between the client and the device. It’s hard to tell how many users there are, but there are lots of apps, and I’m sure that I haven’t found all of them.

There are other protocols with similar approaches being used for the same goal. One is used by ThroughTek’s Kalay Platform, which has the interesting string “Charlie is the designer of P2P!!” in its codebase (32 bytes long, seems to be used as “encryption” key for some non-critical functionality). I recognize both the name and the “handwriting”; it looks like the PPPP protocol designer found a new home here. Yet PPPP still seems to be more popular than the competition, thanks to it being the protocol of choice for cheap low-end cameras.

Disclaimer: Most of the information below has been acquired by analyzing public information as well as reverse engineering applications and firmware, not by observing live systems. Consequently, there can be misinterpretations.

The general design

The protocol’s goal is to serve as a drop-in replacement for TCP. Rather than establish a connection to a known IP address (or a name to be resolved via DNS), clients connect to a device identifier. The abstraction is supposed to hide away how the device is located (via a server that keeps track of its IP address), how a direct communication channel is established (via UDP hole punching) or when one of multiple possible fallback scenarios is being used because direct communication is not possible.

The protocol is meant to be resilient, so there are usually three redundant servers handling each network. When a device or client needs to contact a server, it sends the same message to all of them and doesn’t care which one will reply. Note: In this article “network” generally means a PPPP network, i.e. a set of servers and the devices connecting to them. While client applications typically support multiple networks, devices are always associated with a specific one determined by their device prefix.
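
The “send to every server, take whichever reply comes first” behavior can be sketched as follows. This is a minimal illustration and not code from any PPPP library: the function names and the timeout are assumptions, and the payload is treated as an opaque blob.

```python
import socket

def open_client_socket(timeout=2.0):
    """Create a UDP socket that gives up waiting after `timeout` seconds."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    return sock

def send_to_all(sock, servers, payload):
    """Send the same datagram to every server of the network."""
    for addr in servers:
        sock.sendto(payload, addr)

def first_reply(sock, servers, bufsize=2048):
    """Return (data, addr) of the first datagram coming from a known server.

    Datagrams from unknown senders are skipped. If the socket has a timeout
    and no server answers in time, socket.timeout propagates to the caller.
    """
    known = set(servers)
    while True:
        data, addr = sock.recvfrom(bufsize)
        if addr in known:
            return data, addr
```

A caller would open the socket, call send_to_all() with the three (IP, port) pairs of its network and then call first_reply(), without caring which of the servers ended up answering.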

For what is meant to be a transport layer protocol, PPPP has some serious complexity issues. It encompasses device discovery on the LAN via UDP broadcasts, UDP communication between device/client and the server and a number of (not exactly trivial) fallback solutions. It also features network management functionality and multiple “encryption” algorithms which are more correctly described as obfuscators.

Paul Marrapese’s Wireshark Dissector provides an overview of the messages used by the protocol. While it isn’t quite complete, a look into the pppp.fdesc file shows roughly 70 different message types. It’s hard to tell how all these messages play together as the protocol has not been designed as a state machine. The protocol implementation uses its previous actions as context to interpret incoming messages, but it has little indication as to which messages are expected when. Observing a running system is essential to understanding this protocol.

The complicated message exchange required to establish a connection between a device and a client has been described by Elastic Security Labs. They also provide the code of their client which implements that secret handshake.

I haven’t seen any descriptions of how the fallback approaches work when a direct connection cannot be established. Neither could I observe these fallbacks in action, presumably because the network I observed didn’t enable them. There are at least three such fallbacks: UDP traffic can be relayed by a network-provided server, it can be relayed by a “supernode” which is a device that agreed to be used as a relay, and it can be wrapped in a TCP connection to the server. The two centralized solutions incur significant costs for the network owners, rendering them unpopular. And I can imagine the “supernode” approach to be less than reliable with low-end devices like these cameras (it’s also a privacy hazard but this clearly isn’t a consideration).

I recommend going through the CS2 sales presentation to get an idea of how the protocol is meant to work. Needless to say, it doesn’t always work as intended.

The network ports

I could identify the following network ports being used:

  • UDP 32108: broadcast to discover local devices
  • UDP 32100: device/client communication to the server
  • TCP 443: client communication to the server as fallback

Note that while port 443 is normally associated with HTTPS, here it was apparently only chosen to fool firewalls. The traffic is merely obfuscated, not really encrypted.

The direct communication between the client and the device uses a random UDP port. In my understanding the ports are also randomized when this communication is relayed by a server or supernode.
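
To make the port list more tangible, here is a rough sketch of local device discovery. The framing is an assumption that this article doesn’t spell out: a four-byte header consisting of the magic byte 0xF1, the message type and a 16-bit big-endian payload length. MSG_LAN_SEARCH (type 0x30, empty payload) is taken from the message table further down. The function names are made up, and networks with obfuscation enabled would not understand such a plain message.

```python
import socket

LAN_SEARCH_PORT = 32108  # UDP broadcast to discover local devices
SERVER_UDP_PORT = 32100  # UDP device/client communication to the server
SERVER_TCP_PORT = 443    # TCP fallback, obfuscated rather than TLS

def build_message(msg_type, payload=b""):
    """Assumed framing: 0xF1, type byte, big-endian 16-bit payload length."""
    if not 0 <= msg_type <= 0xFF:
        raise ValueError("message type must fit into one byte")
    if len(payload) > 0xFFFF:
        raise ValueError("payload too long for a 16-bit length field")
    return bytes([0xF1, msg_type]) + len(payload).to_bytes(2, "big") + payload

def lan_search(timeout=1.0):
    """Broadcast MSG_LAN_SEARCH and collect (data, addr) replies until timeout."""
    replies = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.settimeout(timeout)
        sock.sendto(build_message(0x30), ("255.255.255.255", LAN_SEARCH_PORT))
        try:
            while True:
                replies.append(sock.recvfrom(2048))
        except socket.timeout:
            pass  # nobody else is answering
    return replies
```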

The device IDs

The canonical representation of a device ID looks like this: ABC-123456-VWXYZ. Here ABC is a device prefix. While a PPPP network will often handle more than one device prefix, mapping a device prefix to a set of servers is supposed to be unambiguous. This rule isn’t enforced across different protocol variants however, e.g. the device prefix EEEE is assigned differently by CS2 and iLnk.

The six digit number following the device prefix allows distinguishing different devices within a prefix. It seems that vendors can choose these numbers freely – some will assign them to devices sequentially, others go by some more complicated rules. A comment on my previous article even claims that they will sometimes reassign existing device IDs to new devices.

The final part is the verification code, meant to prevent enumeration of devices. It is generated by some secret algorithm and allows distinguishing valid device IDs from invalid ones. At least one such algorithm got leaked in the past.

Depending on the application, a device ID will not always be displayed in its canonical form. It’s pretty typical for the dashes to be removed, for example, and in one case I saw the prefix being shortened to one letter. Finally, there are applications that will hide the device ID from the user altogether, displaying only some vendor-specific ID instead.
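
A small sketch of how a tool might normalize device IDs into the canonical form. It assumes what is described above (a prefix of letters, exactly six digits, a verification code of letters) and accepts input with or without dashes. The names DeviceId and parse_device_id are made up for this example, and a prefix shortened to one letter cannot be restored this way.

```python
import re
from typing import NamedTuple

class DeviceId(NamedTuple):
    prefix: str  # device prefix, e.g. ABC
    serial: str  # six-digit number
    check: str   # verification code

    def canonical(self):
        return f"{self.prefix}-{self.serial}-{self.check}"

# Dashes are optional because many applications strip them.
_DEVICE_ID = re.compile(r"([A-Z]+)-?(\d{6})-?([A-Z]+)")

def parse_device_id(text):
    """Parse 'ABC-123456-VWXYZ' or 'ABC123456VWXYZ' into its three parts."""
    match = _DEVICE_ID.fullmatch(text.strip().upper())
    if match is None:
        raise ValueError(f"not a recognizable device ID: {text!r}")
    return DeviceId(*match.groups())
```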

The protocol variants

So far I could identify at least four variants of this protocol – if you count HLP2P which is questionable. These protocol implementations differ significantly and aren’t really compatible. A number of apps can work with different protocol implementations but they generally do it by embedding multiple client libraries.

Variant        Typical client library names                   Typical functions
CS2 Network    libPPCS_API.so, libobject_jni.so, librtapi.so  PPPP_Initialize, PPPP_ConnectByServer
Yi Technology  PPPP_API.so, libmiio_PPPP_API.so               PPPP_Initialize, PPPP_ConnectByServer
iLnk           libvdp.so, libHiChipP2P.so                     XQP2P_Initialize, XQP2P_ConnectByServer, HI_XQ_P2P_Init
HLP2P          libobject_jni.so, libOKSMARTPPCS.so            HLP2P_Initialize, HLP2P_ConnectByServer

CS2 Network

The Chinese company CS2 Network is the original developer of the protocol. Their implementation can sometimes be recognized without even looking at any code, just by their device IDs: the letters A, I, O and Q are never present in the verification code, leaving only 22 valid letters. The same seems to apply to the Yi Technology fork, however, which is generally very similar.
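
The letter restriction gives a cheap heuristic, sketched below. It can only rule a verification code out, never confirm that a device ID is valid, and the function name is made up for this example.

```python
import string

# The 26 Latin letters minus A, I, O and Q leave 22 letters.
CS2_CODE_ALPHABET = frozenset(string.ascii_uppercase) - frozenset("AIOQ")

def could_be_cs2_code(code):
    """True if the verification code only uses letters seen in CS2/Yi device IDs."""
    return bool(code) and all(ch in CS2_CODE_ALPHABET for ch in code)
```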

The other giveaway is the “init string” which encodes network parameters. Typically these init strings are hardcoded in the application (sometimes hundreds of them) and chosen based on device prefix, though some applications retrieve them from their servers. These init strings are obfuscated, with the function PPPP_DecodeString doing the decoding. The approach is typical for CS2 Network: a lookup table filled with random values and some random algebraic operations to make things seem more complex. The init strings look like this:

DRFTEOBOJWHSFQHQEVGNDQEXFRLZGKLUGSDUAIBXBOIULLKRDNAJDNOZHNKMJO:SECRETKEY

The part before the colon decodes into:

127.0.0.1,192.168.1.1,10.0.0.1,

This is a typical list of three server IPs. No, the trailing comma isn’t a typo but required for correct parsing. Host names are occasionally used in init strings but this is uncommon. With CS2 Network generally distrusting DNS from the looks of it, they probably recommend that vendors sidestep it. The “secret” key after the colon is optional and activates encryption of transferred data, which is better described as obfuscation. Unlike the server addresses, this part isn’t obfuscated.
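
The structure of such an init string can be sketched like this. The sketch deliberately leaves out the decoding done by PPPP_DecodeString, as the lookup table isn’t reproduced in this article, and only handles the parts around it. Both function names are made up.

```python
def split_init_string(init_string):
    """Split 'OBFUSCATED:KEY' into (obfuscated server list, key or None)."""
    obfuscated, separator, key = init_string.partition(":")
    return obfuscated, (key if separator else None)

def parse_server_list(decoded):
    """Parse an already decoded list such as '127.0.0.1,192.168.1.1,10.0.0.1,'."""
    if not decoded.endswith(","):
        raise ValueError("decoded server list is expected to end with a comma")
    return [server for server in decoded.split(",") if server]
```

When the key comes back as None, the network in question would be one that doesn’t enable the obfuscation of transferred data.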

Yi Technology

The Xiaomi spinoff Yi Technology appears to have licensed the code of the CS2 Network implementation. They made some moderate changes to it but it is still very similar to the original. For example, they still use the same code to decode init strings, merely with a different lookup table. Consequently, the same init string as above would look slightly different here:

LZERHWKWHUEQKOFUOREPNWERHLDLDYFSGUFOJXIXJMASBXANOTHRAFMXNXBSAM:SECRETKEY

As can be seen from Paul Marrapese’s Wireshark Dissector, the Yi Technology fork added a bunch of custom protocol messages and extended two messages presumably to provide forward compatibility. The latter is a rather unusual step for the PPPP ecosystem where the dominant approach seems to be “devices and clients connecting to the same network always use the same version of the client library which is frozen for all eternity.”

There is another notable difference: this PPPP implementation doesn’t contain any encryption functionality. There seems to be some AES encryption being performed at the application layer (which is the proper way to do it), I didn’t look too closely however.

iLnk

The protocol fork developed by Shenzhen Yunni Technology iLnkP2P seems to have been developed from scratch. The device IDs for legacy iLnk networks are easy to recognize because their verification codes only consist of the letters A to F. The algorithm generating these verification codes is public knowledge (CVE-2019-11219) so we know that these are letters taken from an MD5 hex digest. New iLnk networks appear to have verification codes that can contain all Latin letters, some new algorithm replaced the compromised one here. Maybe they use Base64 digests now?

An iLnk init string can be recognized by the presence of a dash:

ATBBARASAXAOAQAOAQAOARBBARAZASAOARAWAYAOARAOARBBARAQAOAQAOAQAOAR-$$

The part before the dash decodes into:

3;127.0.0.1;192.168.1.1;10.0.0.1

Yes, the first list entry has to specify how many server IPs there are. The decoding approach (function HI_DecStr or XqStrDec depending on the implementation) is much simpler here, it’s a kind of Base26 encoding. The part after the dash can encode additional parameters related to validation of device IDs but typically it will be $$ indicating that it is omitted and network-specific device ID validation can be skipped. As far as I can tell, iLnk networks will always send all data as plain text, there is no encryption functionality of any kind.
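
The following sketch reproduces the decoding of the example above. The mapping was inferred purely from that one example: each pair of letters is read as a two-digit base 26 number and 32 is added to get the character code. Whether HI_DecStr and XqStrDec behave exactly like this for all inputs is an assumption, and the function names here are made up.

```python
def decode_ilnk_pairs(encoded):
    """Turn letter pairs into characters: (first * 26 + second) + 32."""
    if len(encoded) % 2:
        raise ValueError("expected an even number of letters")
    chars = []
    for i in range(0, len(encoded), 2):
        high, low = encoded[i], encoded[i + 1]
        if not ("A" <= high <= "Z" and "A" <= low <= "Z"):
            raise ValueError("only the letters A to Z are expected")
        chars.append(chr((ord(high) - ord("A")) * 26 + (ord(low) - ord("A")) + 32))
    return "".join(chars)

def parse_ilnk_init_string(init_string):
    """Return (list of servers, undecoded part after the dash)."""
    encoded, _, extra = init_string.partition("-")
    fields = decode_ilnk_pairs(encoded).split(";")
    count, servers = int(fields[0]), fields[1:]
    if count != len(servers):
        raise ValueError("server count does not match the list")
    return servers, extra
```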

Going through the code, the network-level changes in the iLnk fork are extensive, with only the most basic messages shared with the original PPPP protocol. Some message types clash, for example MSG_DEV_MAX, which uses the same type as MSG_DEV_LGN_CRC in the CS2 implementation. This fork also introduces new magic numbers: while PPPP messages normally start with 0xF1, some messages here start with 0xA1 and one for some reason with 0xF2.

Unfortunately, I haven’t seen any comprehensive analysis of this protocol variant yet, so I’ll just list the message types along with their payload sizes. For messages with 20-byte payloads it can be assumed that the payload is a device ID. Don’t ask me why two pairs of messages share the same message type.

Message                      Message type  Payload size
MSG_HELLO                    F1 00         0
MSG_RLY_PKT                  F1 03         0
MSG_DEV_LGN                  F1 10         IPv4: 40, IPv6: 152
MSG_DEV_MAX                  F1 12         20
MSG_P2P_REQ                  F1 20         IPv4: 36, IPv6: 152
MSG_LAN_SEARCH               F1 30         0
MSG_LAN_SEARCH_EXT           F1 32         0
MSG_LAN_SEARCH_EXT_ACK       F1 33         52
MSG_DEV_UNREACH              F1 35         20
MSG_PUNCH_PKT                F1 41         20
MSG_P2P_RDY                  F1 42         20
MSG_RS_LGN                   F1 60         28
MSG_RS_LGN_EX                F1 62         44
MSG_LST_REQ                  F1 67         20
MSG_RLY_HELLO                F1 70         0
MSG_RLY_HELLO_ACK            F1 71         0
MSG_RLY_PORT                 F1 72         0
MSG_RLY_PORT_ACK             F1 73         8
MSG_RLY_PORT_EX_ACK          F1 76         264
MSG_RLY_REQ_EX               F1 77         288
MSG_RLY_REQ                  F1 80         IPv4: 40, IPv6: 160
MSG_HELLO_TO_ACK             F1 83         28
MSG_RLY_RDY                  F1 84         20
MSG_SDEV_LGN                 F1 91         20
MSG_MGM_ADMIN                F1 A0         160
MSG_MGM_DEVLIST_CTRL         F1 A2         20
MSG_MGM_HELLO                F1 A4         4
MSG_MGM_MULTI_DEV_CTRL       F1 A6         variable
MSG_MGM_DEV_DETAIL           F1 A8         24
MSG_MGM_DEV_VIEW             F1 AA         4
MSG_MGM_RLY_LIST             F1 AC         12
MSG_MGM_DEV_CTRL             F1 AE         24
MSG_MGM_MEM_DB               F1 B0         264
MSG_MGM_RLY_DETAIL           F1 B2         24
MSG_MGM_ADMIN_LGOUT          F1 BA         4
MSG_MGM_ADMIN_CHG            F1 BC         164
MSG_VGW_LGN                  F1 C0         24
MSG_VGW_LGN_EX               F1 C0         24
MSG_VGW_REQ                  F1 C3         20
MSG_VGW_REQ_ACK              F1 C4         4
MSG_VGW_HELLO                F1 C5         0
MSG_VGW_LST_REQ              F1 C6         20
MSG_DRW                      F1 D0         variable
MSG_DRW_ACK                  F1 D1         variable
MSG_P2P_ALIVE                F1 E0         0
MSG_P2P_ALIVE_ACK            F1 E1         0
MSG_CLOSE                    F1 F0         0
MSG_MGM_DEV_LGN_DETAIL_DUMP  F1 F4         12
MSG_MGM_DEV_LGN_DUMP         F1 F4         12
MSG_MGM_LOG_CTRL             F1 F7         12
MSG_SVR_REQ                  F2 10         0
MSG_DEV_LV_HB                A1 00         20
MSG_DEV_SLP_HB               A1 01         20
MSG_DEV_QUERY                A1 02         20
MSG_DEV_WK_UP_REQ            A1 04         20
MSG_DEV_WK_UP                A1 06         20
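
For anyone wanting to label captured iLnk traffic, the table translates naturally into a lookup keyed by the first two bytes of a datagram. The sketch below only includes a handful of rows to keep it short and maps each key to a list of names, because two pairs of messages share their type. The identify function is made up for this example.

```python
# (magic byte, message type) -> candidate message names, a subset of the table
MESSAGE_TYPES = {
    (0xF1, 0x00): ["MSG_HELLO"],
    (0xF1, 0x12): ["MSG_DEV_MAX"],
    (0xF1, 0x30): ["MSG_LAN_SEARCH"],
    (0xF1, 0xC0): ["MSG_VGW_LGN", "MSG_VGW_LGN_EX"],
    (0xF1, 0xD0): ["MSG_DRW"],
    (0xF1, 0xD1): ["MSG_DRW_ACK"],
    (0xF1, 0xF4): ["MSG_MGM_DEV_LGN_DETAIL_DUMP", "MSG_MGM_DEV_LGN_DUMP"],
    (0xF2, 0x10): ["MSG_SVR_REQ"],
    (0xA1, 0x00): ["MSG_DEV_LV_HB"],
}

def identify(datagram):
    """Return the candidate names for a datagram, or an empty list if unknown."""
    if len(datagram) < 2:
        raise ValueError("need at least the magic byte and the message type")
    return MESSAGE_TYPES.get((datagram[0], datagram[1]), [])
```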

HLP2P

While I’ve seen a few apps with HLP2P code and the corresponding init strings, I am not sure whether these are still used or merely leftovers from some past adventure. All these apps use primarily networks that rely on other protocol implementations.

HLP2P init strings contain a dash which follows merely three letters. These three letters are ignored and I am unsure about their significance as I’ve only seen one variant:

DAS-0123456789ABCDEF

The decoding function is called from the HLP2P_Initialize function and uses the most elaborate approach of all. The hex-encoded part after the dash is decrypted using AES-CBC where the key and initialization vector are derived from a zero-filled buffer via some bogus MD5 hashing. The decoded result is a list of comma-separated parameters like:

DCDC07FF,das,10000001,a+a+a,127.0.0.1-192.168.1.1-10.0.0.1,ABC-CBA

The fifth parameter is a list of server IP addresses and the sixth appears to be the list of supported device prefixes.
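
Once decrypted, picking the parameter list apart is simple. The sketch below skips the AES step entirely and assumes that the dash separates entries in both the server list and the prefix list, the latter being a guess based on the single example above. The names Hlp2pParams and parse_hlp2p_params are made up.

```python
from typing import List, NamedTuple

class Hlp2pParams(NamedTuple):
    servers: List[str]   # fifth parameter
    prefixes: List[str]  # sixth parameter, presumably device prefixes
    other: List[str]     # first four parameters, meaning unknown

def parse_hlp2p_params(decoded):
    """Split the decrypted, comma-separated parameter list."""
    fields = decoded.split(",")
    if len(fields) < 6:
        raise ValueError("expected at least six comma-separated parameters")
    return Hlp2pParams(
        servers=fields[4].split("-"),
        prefixes=fields[5].split("-"),
        other=fields[:4],
    )
```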

On the network level HLP2P is an oddity here. Despite trying hard to provide the same API as other PPPP implementations, including concepts like init strings and device IDs, it appears to be a TCP-based protocol (connecting to server’s port 65527) with little resemblance to PPPP. UDP appears to be used for local broadcasts only (on port 65531). I didn’t spend too much time on the analysis however.

“Encryption”

The CS2 implementation of the protocol is the only one that bothers with encrypting data, though their approach is better described as obfuscation. When encryption is enabled, the function P2P_Proprietary_Encrypt is applied to all outgoing and the function P2P_Proprietary_Decrypt to all incoming messages. These functions take the encryption key (which is visible in the application code as an unobfuscated part of the init string) and mash it into four bytes. These four bytes are then used to select values from a static table that the bytes of the message should be XOR’ed with.

There is at least one public implementation of this “encryption” though this one chose to skip the “key mashing” part and simply took the resulting four bytes as its key. A number of articles mention having implemented this algorithm, however; it’s not really complicated.

The same obfuscation is used unconditionally for TCP traffic (TCP communication on port 443 as fallback). Here each message header contains two random bytes. The hex representation of these bytes is used as key to obfuscate message contents.

All *_CRC messages like MSG_DEV_LGN_CRC have an additional layer of obfuscation, performed by the functions PPPP_CRCEnc and PPPP_CRCDec. Unlike P2P_Proprietary_Encrypt which is applied to the entire message including the header, PPPP_CRCEnc is only applied to the payload. As normally only messages exchanged between the device and the server are obfuscated in this way, the corresponding key tends to be contained only in the device firmware and not in the application. Here as well the key is mashed into four bytes which are then used to generate a byte sequence that the message (extended by four + signs) is XOR’ed with. This is effectively an XOR cipher with a static key which is easy to crack even without knowing the key.
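
Why a static XOR sequence is so weak can be shown in a few lines. This is a generic illustration and not the PPPP_CRCEnc algorithm: the byte sequence is random filler standing in for whatever the real key produces, and the payloads are hypothetical. The point is that one known plaintext reveals the sequence, which then decodes any other message obfuscated with the same key.

```python
import os

def xor_bytes(data, keystream):
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(a ^ b for a, b in zip(data, keystream))

# Stand-in for the sequence generated from the key, NOT real PPPP_CRCEnc output.
keystream = os.urandom(24)

# Two hypothetical payloads obfuscated with the same static sequence.
known_plain = b"ABC-123456-VWXYZ" + b"++++"  # a message we know, e.g. our own
other_plain = b"ABC-654321-ZYXWV" + b"++++"  # somebody else's message
known_cipher = xor_bytes(known_plain, keystream)
other_cipher = xor_bytes(other_plain, keystream)

# Known plaintext XOR ciphertext yields the sequence, no key required.
recovered = xor_bytes(known_cipher, known_plain)
decrypted = xor_bytes(other_cipher, recovered)
```

Even without a fully known message, the four + signs appended to every message give away the corresponding bytes of the sequence for that message length.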

“Secret” messages

The CS2 implementation of the protocol contains a curiosity: two messages starting with 338DB900E559 being processed in a special way. No, this isn’t a hexadecimal representation of the bytes – it’s literally the message contents. No magic bytes, no encryption, the messages are expected to be 17 bytes long and are treated as zero-terminated strings.

I tried sending 338DB900E5592B32 (with a trailing zero byte) to a PPPP server and, surprisingly, received a response (non-ASCII bytes are represented as escape sequences):

\x0e\x0ay\x07\[email protected]!

This response was consistent for this server, but another server of the same network responded slightly differently:

\x0e\x0ay\x07\[email protected]!

A server from a different network which normally encrypts all communication also responded:

\x17\x06f\[email protected]!

It doesn’t take a lot of cryptanalysis knowledge to realize that an XOR cipher with a constant key is being applied here. Thanks to my “razor sharp deduction” I could conclude that the servers are replying with their respective names and these names are being XOR’ed with the string [email protected]!. Yes, likely the very same Charlie already mentioned at the start of this article. Hi, Charlie!

I didn’t risk sending the other message, not wanting to shut down a server accidentally. But maybe Shodan wants to extend their method of detecting PPPP servers: their current approach only works when no encryption is used, yet this message seems to get replies from all CS2 servers regardless of encryption.
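
For anyone filtering packet captures for these messages, the description above boils down to a simple check: literal ASCII text starting with 338DB900E559, terminated by a zero byte, 17 bytes in total. The sketch below only builds and recognizes such messages and doesn’t send anything. The function names are made up.

```python
SECRET_PREFIX = b"338DB900E559"
SECRET_LENGTH = 17  # 16 ASCII characters plus the terminating zero byte

def build_secret_message(text):
    """Encode the literal text and append the zero terminator."""
    data = text.encode("ascii") + b"\x00"
    if len(data) != SECRET_LENGTH or not data.startswith(SECRET_PREFIX):
        raise ValueError("not one of the special messages")
    return data

def is_secret_message(datagram):
    """Recognize the special messages in captured traffic."""
    return (
        len(datagram) == SECRET_LENGTH
        and datagram.startswith(SECRET_PREFIX)
        and datagram.endswith(b"\x00")
    )
```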

Applications

Once a connection between the client and the device is established, MSG_DRW messages are exchanged in both directions. The messages will be delivered in order and retransmitted if lost, giving application developers something resembling a TCP stream if you don’t look too closely. In addition, each message is tagged with a channel ID, a number between 0 and 7. It looks like channel IDs are universally ignored by devices and are only relevant in the other direction. The idea seems to be that a client receiving a video stream should still be able to send commands to the device and receive responses over the same connection.
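
The in-order delivery per channel can be pictured as a small reordering buffer on the receiving side. The sketch below is a simplified model and not the actual MSG_DRW handling: it assumes one sequence counter per channel starting at 0, ignores counter wraparound and leaves out acknowledgements and retransmission timers. The class names are made up.

```python
class ChannelReassembler:
    """Deliver the packets of one channel in sequence order."""

    def __init__(self):
        self.next_seq = 0  # sequence number expected next
        self.pending = {}  # packets that arrived too early

    def push(self, seq, data):
        """Accept one packet, return the payloads that became deliverable."""
        if seq < self.next_seq or seq in self.pending:
            return []  # duplicate, typically caused by a retransmission
        self.pending[seq] = data
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready

class Connection:
    CHANNELS = 8  # channel IDs 0 to 7

    def __init__(self):
        self.channels = [ChannelReassembler() for _ in range(self.CHANNELS)]

    def receive(self, channel, seq, data):
        if not 0 <= channel < self.CHANNELS:
            raise ValueError("channel ID must be between 0 and 7")
        return self.channels[channel].push(seq, data)
```

Keeping the state per channel is what would allow a command response to get through even while a video stream on another channel is waiting for a lost packet.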

The PPPP protocol doesn’t make any recommendations about how applications should encode their data within that stream, and so they developed a number of wildly different application-level protocols. As a rule of thumb, all devices and clients on a particular PPPP network will always speak the same application-level protocol, though there might be slight differences in the supported capabilities. Different networks can share the same protocol, allowing them to be supported within the same application. Usually, there will be multiple applications implementing the same application-level protocol and working with the same PPPP networks, but I haven’t yet seen any applications supporting different protocols.

This allows grouping the applications by their application-level protocol. Applications within the same group are largely interchangeable: the same devices can be accessed from any of them. This doesn’t necessarily mean that everything will work correctly, as there might still be subtle differences. E.g. an application meant for visual doorbells probably accesses somewhat different functionality than one meant for security cameras even if both share the same protocol. Also, devices might be tied to the cloud infrastructure of a specific application, rendering them inaccessible to other applications working with the same PPPP network.

Fun fact: it is often very hard to know up front which protocol your device will speak. There is a huge thread with many spin-offs where people are attempting to reverse engineer A9 Mini cameras so that these can be accessed without an app. This effort is being massively complicated by the fact that all these cameras look basically the same, yet depending on the camera one out of at least four extremely different protocols could be used: HDWifiCamPro variant of SHIX JSON, YsxLite variant of iLnk binary, JXLCAM variant of CGI calls, or some protocol I don’t know because it isn’t based on PPPP.

The following is a list of PPPP-based applications I’ve identified so far, at least the ones with noteworthy user numbers. Mind you, these numbers aren’t necessarily indicative of the number of PPPP devices – some applications listed only use PPPP for some devices, likely using other protocols for most of their supported devices (particularly the ones that aren’t cameras). I try to provide a brief overview of the application-level protocol in the footnotes. Disclaimer: These applications tend to support a huge number of device prefixes in theory, so I mostly chose the “typical” ones based on which ones appear in YouTube videos or GitHub discussions.


Source: https://palant.info/2025/11/05/an-overview-of-the-pppp-protocol-for-iot-cameras/