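PJSIP's conference bridge abstracts every media input and output as a port and mixes the PCM data flowing through those ports, which is what makes multi-party calls possible.
Two aspects of the conference bridge are of most interest:
- how ports are abstracted
- how mixing is done
The conference code lives in pjmedia/src/pjmedia/conference.c
Where the conference bridge sits in the PJSIP media flow (the figure is borrowed from another author):
1. Abstracting ports
The conference bridge treats both the sound device and each stream as a port. The bridge manages these ports and performs the connect operations between them.
The basic port data structure is defined as follows: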
/**
* Port interface.
*/
typedef struct pjmedia_port
{
pjmedia_port_info info; /**< Port information. */
/** Port data can be used by the port creator to attach arbitrary
* value to be associated with the port.
*/
struct port_data {
void *pdata; /**< Pointer data. */
long ldata; /**< Long data. */
} port_data;
/**
* Get clock source.
* This should only be called by #pjmedia_port_get_clock_src().
*/
pjmedia_clock_src* (*get_clock_src)(struct pjmedia_port *this_port,
pjmedia_dir dir);
/**
* Sink interface.
* This should only be called by #pjmedia_port_put_frame().
*/
pj_status_t (*put_frame)(struct pjmedia_port *this_port,
pjmedia_frame *frame);
/**
* Source interface.
* This should only be called by #pjmedia_port_get_frame().
*/
pj_status_t (*get_frame)(struct pjmedia_port *this_port,
pjmedia_frame *frame);
/**
* Called to destroy this port.
*/
pj_status_t (*on_destroy)(struct pjmedia_port *this_port);
} pjmedia_port;
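A port has three main operations: put_frame, get_frame and on_destroy.
put_frame is the sink side: it carries the data produced by the sound port (the audio recorded from the mic) step by step toward the network.
get_frame is the source side: it carries the data received from the network step by step toward the sound device for playback.
on_destroy is called when the port is destroyed.
Every port abstraction is built on this interface.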
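To make the idea of "a port is just a pair of callbacks" concrete, here is a minimal sketch. It does not use the real pjmedia types: demo_frame, demo_port, demo_pump and the two example ports are hypothetical stand-ins invented for illustration, and the frame size is deliberately tiny.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SAMPLES_PER_FRAME 4   /* tiny frame, for illustration only */

/* Hypothetical stand-in for pjmedia_frame: one block of 16-bit PCM. */
typedef struct demo_frame {
    int16_t samples[DEMO_SAMPLES_PER_FRAME];
} demo_frame;

/* Hypothetical stand-in for pjmedia_port: user data plus two callbacks. */
typedef struct demo_port {
    void *pdata;                                                    /* like port_data.pdata */
    int (*put_frame)(struct demo_port *self, const demo_frame *f);  /* sink interface   */
    int (*get_frame)(struct demo_port *self, demo_frame *f);        /* source interface */
} demo_port;

/* Source port: fills each frame with the constant value stored in pdata. */
static int const_get_frame(demo_port *self, demo_frame *f)
{
    int16_t value = *(int16_t *)self->pdata;
    for (size_t i = 0; i < DEMO_SAMPLES_PER_FRAME; ++i)
        f->samples[i] = value;
    return 0;
}

/* Sink port: adds every received sample to the long stored in pdata. */
static int sum_put_frame(demo_port *self, const demo_frame *f)
{
    long *total = (long *)self->pdata;
    for (size_t i = 0; i < DEMO_SAMPLES_PER_FRAME; ++i)
        *total += f->samples[i];
    return 0;
}

/* Move one frame from a source to a sink. The caller only sees the interface. */
static int demo_pump(demo_port *src, demo_port *sink)
{
    demo_frame f;
    assert(src->get_frame != NULL && sink->put_frame != NULL);
    if (src->get_frame(src, &f) != 0)
        return -1;
    return sink->put_frame(sink, &f);
}

/* Pump `count` frames of constant `value`; return what the sink accumulated. */
long demo_port_roundtrip(int16_t value, int count)
{
    long total = 0;
    demo_port src  = { &value, NULL, const_get_frame };
    demo_port sink = { &total, sum_put_frame, NULL };
    for (int i = 0; i < count; ++i)
        if (demo_pump(&src, &sink) != 0)
            return -1;
    return total;
}
```

Calling demo_port_roundtrip(10, 3) pumps three frames of four samples each, so the sink ends up with 120. demo_pump never needs to know what kind of port sits at either end, and that is the property the conference bridge relies on.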
Next, look at how the conference bridge manages ports. It defines its own port structure, conf_port, which wraps each abstracted port.
The conf_port structure is defined as follows:
/**
* This is a port connected to conference bridge.
*/
struct conf_port
{
pj_str_t name; /**< Port name. */
pjmedia_port *port; /**< get_frame() and put_frame() */
pjmedia_port_op rx_setting; /**< Can we receive from this port */
pjmedia_port_op tx_setting; /**< Can we transmit to this port */
unsigned listener_cnt; /**< Number of listeners. */
SLOT_TYPE *listener_slots;/**< Array of listeners. */
unsigned transmitter_cnt;/**<Number of transmitters. */
/* Shortcut for port info. */
unsigned clock_rate; /**< Port's clock rate. */
unsigned samples_per_frame; /**< Port's samples per frame. */
unsigned channel_count; /**< Port's channel count. */
/* Calculated signal levels: */
unsigned tx_level; /**< Last tx level to this port. */
unsigned rx_level; /**< Last rx level from this port. */
/* The normalized signal level adjustment.
* A value of 128 (NORMAL_LEVEL) means there's no adjustment.
*/
unsigned tx_adj_level; /**< Adjustment for TX. */
unsigned rx_adj_level; /**< Adjustment for RX. */
/* Resample, for converting clock rate, if they're different. */
pjmedia_resample *rx_resample;
pjmedia_resample *tx_resample;
/* RX buffer is temporary buffer to be used when there is mismatch
* between port's sample rate or ptime with conference's sample rate
* or ptime. The buffer is used for sampling rate conversion AND/OR to
* buffer the samples until there are enough samples to fulfill a
* complete frame to be processed by the bridge.
*
* When both sample rate AND ptime of the port match the conference
* settings, this buffer will not be created.
*
* This buffer contains samples at port's clock rate.
* The size of this buffer is the sum between port's samples per frame
* and bridge's samples per frame.
*/
pj_int16_t *rx_buf; /**< The RX buffer. */
unsigned rx_buf_cap; /**< Max size, in samples */
unsigned rx_buf_count; /**< # of samples in the buf. */
/* Mix buf is a temporary buffer used to mix all signal received
* by this port from all other ports. The mixed signal will be
* automatically adjusted to the appropriate level whenever
* there is possibility of clipping.
*
* This buffer contains samples at bridge's clock rate.
* The size of this buffer is equal to samples per frame of the bridge.
*/
int mix_adj; /**< Adjustment level for mix_buf. */
int last_mix_adj; /**< Last adjustment level. */
pj_int32_t *mix_buf; /**< Total sum of signal. */
/* Tx buffer is a temporary buffer to be used when there's mismatch
* between port's clock rate or ptime with conference's sample rate
* or ptime. This buffer is used as the source of the sampling rate
* conversion AND/OR to buffer the samples until there are enough
* samples to fulfill a complete frame to be transmitted to the port.
*
* When both sample rate and ptime of the port match the bridge's
* settings, this buffer will not be created.
*
* This buffer contains samples at port's clock rate.
* The size of this buffer is the sum between port's samples per frame
* and bridge's samples per frame.
*/
pj_int16_t *tx_buf; /**< Tx buffer. */
unsigned tx_buf_cap; /**< Max size, in samples. */
unsigned tx_buf_count; /**< # of samples in the buffer. */
/* When the port is not receiving signal from any other ports (e.g. when
* no other ports is transmitting to this port), the bridge periodically
* transmit NULL frame to the port to keep the port "alive" (for example,
* a stream port needs this heart-beat to periodically transmit silence
* frame to keep NAT binding alive).
*
* This NULL frame should be sent to the port at the port's ptime rate.
* So if the port's ptime is greater than the bridge's ptime, the bridge
* needs to delay the NULL frame until it's the right time to do so.
*
* This variable keeps track of how many pending NULL samples are being
* "held" for this port. Once this value reaches samples_per_frame
* value of the port, a NULL frame is sent. The samples value on this
* variable is clocked at the port's clock rate.
*/
unsigned tx_heart_beat;
/* Delay buffer is a special buffer for sound device port (port 0, master
* port) and other passive ports (sound device port is also passive port).
*
* We need the delay buffer because we can not expect the mic and speaker
* thread to run equally after one another. In most systems, each thread
* will run multiple times before the other thread gains execution time.
* For example, in my system, mic thread is called three times, then
* speaker thread is called three times, and so on. This we call burst.
*
* There is also possibility of drift, unbalanced rate between put_frame
* and get_frame operation, in passive ports. If drift happens, snd_buf
* needs to be expanded or shrinked.
*
* Burst and drift are handled by delay buffer.
*/
pjmedia_delay_buf *delay_buf;
};
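The structure has many fields, but only a few of them matter when reading the code:
- port: the pjmedia_port that every kind of port shares; the get_frame/put_frame interface is defined there.
- listener_cnt: the number of other ports listening to this port.
- listener_slots: the port ids (slots) of the listeners. Together with listener_cnt, it records which ports this port's data has to be delivered to.
- transmitter_cnt: the number of ports transmitting to this port.
- mix_buf: the mixing buffer. It is an array of 32-bit integers although PCM samples are 16-bit, which leaves headroom for summing several sources before the level is adjusted.
- delay_buf: the delay buffer.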
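The reason mix_buf is wider than the PCM samples can be shown with a short sketch. The function names below are hypothetical. The sketch also simplifies the output step: according to the comment in conf_port, the real bridge adjusts the level of the mixed signal when clipping is possible, whereas this example just saturates to the 16-bit range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add one 16-bit source frame into a 32-bit accumulator (mix_buf-like). */
void demo_mix_add(int32_t *mix_buf, const int16_t *src, size_t n)
{
    assert(mix_buf != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i)
        mix_buf[i] += src[i];
}

/* Convert the accumulator back to 16-bit PCM.
 * Simplification (assumption): plain saturation instead of level adjustment. */
void demo_mix_output(const int32_t *mix_buf, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        int32_t s = mix_buf[i];
        if (s > INT16_MAX) s = INT16_MAX;
        if (s < INT16_MIN) s = INT16_MIN;
        out[i] = (int16_t)s;
    }
}

/* Mix two constant frames and return the first output sample. */
int demo_mix_two(int16_t a, int16_t b)
{
    int16_t fa[2] = { a, a };
    int16_t fb[2] = { b, b };
    int16_t out[2];
    int32_t mix[2] = { 0, 0 };

    demo_mix_add(mix, fa, 2);
    demo_mix_add(mix, fb, 2);   /* 30000 + 30000 fits in int32_t, not in int16_t */
    demo_mix_output(mix, out, 2);
    return out[0];
}
```

With two loud inputs of 30000 each, the accumulator holds 60000 and the output saturates at 32767. With 100 and -40 the result is simply 60. Had the sum been formed in a 16-bit variable, the first case would already have overflowed before any correction could be applied.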
In my view, delay_buf is one of the highlights of the conf_port definition. To see why, read its comment:
/* Delay buffer is a special buffer for sound device port (port 0, master
* port) and other passive ports (sound device port is also passive port).
*
* We need the delay buffer because we can not expect the mic and speaker
* thread to run equally after one another. In most systems, each thread
* will run multiple times before the other thread gains execution time.
* For example, in my system, mic thread is called three times, then
* speaker thread is called three times, and so on. This we call burst.
*
* There is also possibility of drift, unbalanced rate between put_frame
* and get_frame operation, in passive ports. If drift happens, snd_buf
* needs to be expanded or shrinked.
*
* Burst and drift are handled by delay buffer.
*/
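delay_buf exists to decouple recording and playback on the sound device. The mic (capture) thread stores the recorded PCM data in delay_buf, the speaker (playback) thread takes the data out again, and the mixing is done in the speaker thread. This resolves the synchronization problem between the two threads when mixing PCM audio.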
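The burst pattern described in the comment (three capture callbacks, then three playback callbacks) can be illustrated with a small FIFO. This is only a sketch under simplifying assumptions: demo_delay_buf and its functions are hypothetical, everything runs in one thread with no locking, and drift handling (expanding or shrinking the buffer) is left out entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_FRAME 4                  /* samples per frame (illustrative) */
#define DEMO_CAP   (DEMO_FRAME * 4)   /* room for a burst of four frames  */

/* Hypothetical, much simplified stand-in for a delay buffer. */
typedef struct demo_delay_buf {
    int16_t  data[DEMO_CAP];
    unsigned head;    /* index of the oldest stored sample */
    unsigned count;   /* number of samples currently stored */
} demo_delay_buf;

void demo_delay_init(demo_delay_buf *b)
{
    memset(b, 0, sizeof(*b));
}

/* Capture side (mic thread): store one frame. Returns -1 if the buffer is full. */
int demo_delay_put(demo_delay_buf *b, const int16_t *frame)
{
    if (b->count + DEMO_FRAME > DEMO_CAP)
        return -1;
    for (unsigned i = 0; i < DEMO_FRAME; ++i)
        b->data[(b->head + b->count + i) % DEMO_CAP] = frame[i];
    b->count += DEMO_FRAME;
    return 0;
}

/* Playback side (speaker thread): fetch one frame.
 * On underrun the frame is filled with silence and -1 is returned. */
int demo_delay_get(demo_delay_buf *b, int16_t *frame)
{
    if (b->count < DEMO_FRAME) {
        memset(frame, 0, DEMO_FRAME * sizeof(int16_t));
        return -1;
    }
    for (unsigned i = 0; i < DEMO_FRAME; ++i)
        frame[i] = b->data[(b->head + i) % DEMO_CAP];
    b->head = (b->head + DEMO_FRAME) % DEMO_CAP;
    b->count -= DEMO_FRAME;
    return 0;
}

/* Simulate a burst: three puts, then three gets.
 * Returns the first sample of the n-th frame read back (n = 0..2). */
int demo_delay_burst(int n)
{
    demo_delay_buf b;
    int16_t f[DEMO_FRAME];
    int result = -1;

    assert(n >= 0 && n < 3);
    demo_delay_init(&b);
    for (int k = 0; k < 3; ++k) {             /* mic runs three times in a row */
        for (unsigned i = 0; i < DEMO_FRAME; ++i)
            f[i] = (int16_t)(100 * (k + 1));
        if (demo_delay_put(&b, f) != 0)
            return -1;
    }
    for (int k = 0; k < 3; ++k) {             /* then the speaker runs three times */
        if (demo_delay_get(&b, f) != 0)
            return -1;
        if (k == n)
            result = f[0];
    }
    return result;
}
```

Even though the producer and consumer do not alternate, the frames come back in order: demo_delay_burst(0) yields 100 and demo_delay_burst(2) yields 300. Without the intermediate buffer, the second and third captured frames would have nowhere to go until playback caught up.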
Now look at the definition of the conference bridge itself:
/*
* Conference bridge.
*/
struct pjmedia_conf
{
unsigned options; /**< Bitmask options. */
unsigned max_ports; /**< Maximum ports. */
unsigned port_cnt; /**< Current number of ports. */
unsigned connect_cnt; /**< Total number of connections */
pjmedia_snd_port *snd_dev_port; /**< Sound device port. */
pjmedia_port *master_port; /**< Port zero's port. */
char master_name_buf[80]; /**< Port0 name buffer. */
pj_mutex_t *mutex; /**< Conference mutex. */
struct conf_port **ports; /**< Array of ports. */
unsigned clock_rate; /**< Sampling rate. */
unsigned channel_count;/**< Number of channels (1=mono). */
unsigned samples_per_frame; /**< Samples per frame. */
unsigned bits_per_sample; /**< Bits per sample. */
};
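pjmedia_conf is essentially a manager of conf_port objects. master_port is the port that represents the sound device inside the bridge, while snd_dev_port is the actual sound device. snd_dev_port drives the conference bridge through master_port.
ports is an array of pointers to conf_port (not a two-dimensional array) that holds every conf_port in the bridge. Element 0 is reserved for master_port, that is, the sound port.
The code ties them together as follows.
pjmedia_conf_create contains this code: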
conf->master_port->port_data.pdata = conf;
conf->master_port->port_data.ldata = 0;
The 0 stored in ldata is the index into the ports array, i.e. its first element.
In the create_sound_port function,
note this code:
/* Add the port to the bridge */
conf->ports[0] = conf_port;
conf->port_cnt++;
These two pieces of code link master_port to its conf_port.
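The point of storing the bridge pointer and the slot number in port_data is that a callback receiving only the port can find its way back to the bridge. A minimal sketch of that lookup follows. All demo_ types and functions are hypothetical simplifications, and the port name used is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_PORTS 4

/* Hypothetical, reduced version of a media port: only the user-data part. */
typedef struct demo_port {
    struct {
        void *pdata;   /* will point back to the bridge */
        long  ldata;   /* will hold the slot number     */
    } port_data;
} demo_port;

/* Hypothetical, reduced version of conf_port. */
typedef struct demo_conf_port {
    const char *name;
    demo_port  *port;
} demo_conf_port;

/* Hypothetical, reduced version of the bridge. */
typedef struct demo_conf {
    demo_port       master_port;
    demo_conf_port *ports[DEMO_MAX_PORTS];   /* array of pointers, slot 0 first */
    unsigned        port_cnt;
} demo_conf;

/* What a callback can do when it is handed nothing but the port. */
demo_conf_port *demo_lookup(demo_port *this_port)
{
    demo_conf *conf = (demo_conf *)this_port->port_data.pdata;
    long slot = this_port->port_data.ldata;

    assert(conf != NULL && slot >= 0 && slot < DEMO_MAX_PORTS);
    return conf->ports[slot];
}

/* Wire slot 0 the same way the two snippets above do, then look it up. */
const char *demo_master_name(void)
{
    demo_conf conf;
    demo_conf_port snd;

    memset(&conf, 0, sizeof(conf));
    snd.name = "Master/sound";               /* made-up name for the example */
    snd.port = &conf.master_port;

    conf.master_port.port_data.pdata = &conf;   /* back-pointer to the bridge */
    conf.master_port.port_data.ldata = 0;       /* slot index                 */
    conf.ports[0] = &snd;                       /* register as port 0         */
    conf.port_cnt++;

    return demo_lookup(&conf.master_port)->name;
}
```

demo_master_name() returns "Master/sound": starting from the port alone, the lookup reaches the bridge through pdata and then the matching conf_port through ldata.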
So how does the conference bridge actually run?
It is driven through master_port, and master_port is called by snd_dev_port, so the bridge is driven by the sound device's playback callback (play_cb) thread.
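2. How mixing works
Schematic of the mixing:
- pjmedia_conf_connect_port(pjmedia_conf *conf, unsigned src_slot, unsigned sink_slot, int level) connects two ports, as shown in the figure.
- Once connected, sink_slot is a listener of src_slot.
- Mixing adds the frame from src_slot into the mix_buf of each of its listeners in turn. In effect, every port mixes all of its input PCM data and then outputs the result. In the figure, Port0 adds the PCM data obtained from its get_frame into Port1's mix_buf, which amounts to Port1 mixing in Port0's input.
Mixing takes place in the bridge's get_frame function.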
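The connect-then-mix flow can be sketched as one "tick" of a toy bridge. This is an illustration under stated assumptions, not the code in conference.c: the demo_ names are hypothetical, every port already has its input frame in memory, all ports share the same clock rate and frame size, the level argument and rx/tx settings are ignored, and output is saturated rather than level-adjusted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PORTS 3
#define DEMO_FRAME 2

/* Hypothetical, reduced conf_port: only what the mixing loop needs. */
typedef struct demo_conf_port {
    int16_t  input[DEMO_FRAME];            /* what get_frame would return from this port */
    int16_t  output[DEMO_FRAME];           /* what put_frame would deliver to this port  */
    int32_t  mix_buf[DEMO_FRAME];          /* sum of everything sent to this port        */
    unsigned listener_slots[DEMO_PORTS];
    unsigned listener_cnt;
    unsigned transmitter_cnt;
} demo_conf_port;

typedef struct demo_conf {
    demo_conf_port ports[DEMO_PORTS];
} demo_conf;

/* Make sink_slot a listener of src_slot (one-directional). */
void demo_connect(demo_conf *c, unsigned src_slot, unsigned sink_slot)
{
    assert(src_slot < DEMO_PORTS && sink_slot < DEMO_PORTS);
    demo_conf_port *src = &c->ports[src_slot];
    assert(src->listener_cnt < DEMO_PORTS);
    src->listener_slots[src->listener_cnt++] = sink_slot;
    c->ports[sink_slot].transmitter_cnt++;
}

/* One bridge tick: clear, accumulate into listeners, then produce output. */
void demo_tick(demo_conf *c)
{
    for (unsigned p = 0; p < DEMO_PORTS; ++p)
        memset(c->ports[p].mix_buf, 0, sizeof(c->ports[p].mix_buf));

    for (unsigned p = 0; p < DEMO_PORTS; ++p) {
        demo_conf_port *src = &c->ports[p];
        for (unsigned l = 0; l < src->listener_cnt; ++l) {
            demo_conf_port *sink = &c->ports[src->listener_slots[l]];
            for (unsigned i = 0; i < DEMO_FRAME; ++i)
                sink->mix_buf[i] += src->input[i];
        }
    }

    for (unsigned p = 0; p < DEMO_PORTS; ++p) {
        for (unsigned i = 0; i < DEMO_FRAME; ++i) {
            int32_t s = c->ports[p].mix_buf[i];
            if (s > INT16_MAX) s = INT16_MAX;   /* simplification: saturate */
            if (s < INT16_MIN) s = INT16_MIN;
            c->ports[p].output[i] = (int16_t)s;
        }
    }
}

/* Three-way call: ports send 100, 20 and 3; return what `slot` hears. */
int demo_three_way(unsigned slot)
{
    static const int16_t level[DEMO_PORTS] = { 100, 20, 3 };
    demo_conf c;

    assert(slot < DEMO_PORTS);
    memset(&c, 0, sizeof(c));
    for (unsigned p = 0; p < DEMO_PORTS; ++p)
        for (unsigned i = 0; i < DEMO_FRAME; ++i)
            c.ports[p].input[i] = level[p];

    for (unsigned s = 0; s < DEMO_PORTS; ++s)      /* connect every pair, */
        for (unsigned d = 0; d < DEMO_PORTS; ++d)  /* but never a port to itself */
            if (s != d)
                demo_connect(&c, s, d);

    demo_tick(&c);
    return c.ports[slot].output[0];
}
```

In this three-way setup port 0 hears 23 (20 + 3), port 1 hears 103 and port 2 hears 120. Each port receives the sum of the others but not its own input, because a port only appears in the listener lists of the ports it was connected to.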