[iOS] H.264 Codec (VideoToolbox Hardware Codec)
Introduction
Since iOS 8.0, Apple has exposed its hardware encoding and decoding APIs. VideoToolbox is a low-level framework with a pure C API that gives direct access to the hardware encoder and decoder, providing video compression and decompression services. The main purpose of this article is to walk through H.264 hardware encoding and decoding with VideoToolbox. For background on H.264 itself, see "Introduction to H.264 and Coding Principles"; this article does not cover that in depth.
Hard decoding vs. soft decoding

Hard decoding: decoding on the GPU, offloading work from the CPU
- Pros: smooth playback, low power consumption, fast decoding
- Cons: poorer compatibility

Soft decoding: decoding on the CPU (e.g. with ffmpeg)
- Pros: good compatibility
- Cons: higher CPU load and power consumption; playback is less smooth and decoding is slower than hard decoding
Encoding with VideoToolbox:
1. First, import the framework: #import <VideoToolbox/VideoToolbox.h>
2. Initialize the encoding session
```objectivec
@property (nonatomic, assign) VTCompressionSessionRef compressionSession;

// Initialize the encoder
- (void)setupVideoSession {
    // 1. Used to record the current frame count
    self.frameID = 0;
    // 2. Width & height of the video; adjust to your actual needs
    int width = 720;
    int height = 1280;
    // 3. Create the CompressionSession object used to encode frames
    OSStatus status = VTCompressionSessionCreate(
        NULL,                    // Allocator for the session; NULL uses the default allocator.
        width,                   // Frame width, in pixels.
        height,                  // Frame height, in pixels.
        kCMVideoCodecType_H264,  // Codec type: encode with H.264.
        NULL,                    // Specific video encoder to use; NULL lets VideoToolbox choose.
        NULL,                    // Required source pixel buffer attributes, used to create a pixel buffer pool; NULL if you don't want VideoToolbox to create one.
        NULL,                    // Allocator for the compressed data; NULL uses the default allocator.
        didCompressH264,         // Callback invoked when a frame finishes encoding; the data can be written out there.
        (__bridge void *)(self), // outputCallbackRefCon
        &_compressionSession);   // Receives the created compression session.
    if (status != noErr) {
        NSLog(@"H264: session creation failed");
        return;
    }
    // 4. Real-time encoded output (a live stream must be encoded in real time, otherwise there will be latency)
    VTSessionSetProperty(_compressionSession, kVTCompressionPropertyKey_RealTime, kCFBooleanTrue);
    VTSessionSetProperty(_compressionSession, kVTCompressionPropertyKey_ProfileLevel, kVTProfileLevel_H264_Baseline_AutoLevel);
    // 5. Set the keyframe (GOP size) interval
    int frameInterval = 60;
    CFNumberRef frameIntervalRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &frameInterval);
    VTSessionSetProperty(self.compressionSession, kVTCompressionPropertyKey_MaxKeyFrameInterval, frameIntervalRef);
    // 6. Set the expected frame rate (frames per second; too low a frame rate makes the picture stutter)
    int fps = 24;
    CFNumberRef fpsRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &fps);
    VTSessionSetProperty(self.compressionSession, kVTCompressionPropertyKey_ExpectedFrameRate, fpsRef);
    // 7. Set the bit rate (a higher bit rate gives a clearer picture; too low a bit rate causes
    //    mosaic artifacts. A high bit rate preserves the picture but is harder to transmit.)
    int bitRate = width * height * 3 * 4 * 8;
    CFNumberRef bitRateRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bitRate);
    VTSessionSetProperty(self.compressionSession, kVTCompressionPropertyKey_AverageBitRate, bitRateRef);
    // 8. Set the data rate limit: the first value is a byte count, the second a window in seconds
    NSArray *limit = @[@(bitRate * 1.5 / 8), @(1)];
    VTSessionSetProperty(self.compressionSession, kVTCompressionPropertyKey_DataRateLimits, (__bridge CFArrayRef)limit);
    // 9. Basic settings done; prepare to encode
    VTCompressionSessionPrepareToEncodeFrames(_compressionSession);
}
```
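For concreteness, the arithmetic in steps 7 and 8 can be checked in plain C (the helper names are mine; the 720x1280 dimensions are the ones used in the snippet above):

```c
#include <stdint.h>

/* Step 7's formula: average bit rate in bits per second. */
int average_bit_rate(int width, int height) {
    return width * height * 3 * 4 * 8;
}

/* Step 8's first DataRateLimits entry: bytes allowed per one-second window. */
long data_rate_limit_bytes(int bitRate) {
    return (long)(bitRate * 1.5 / 8);
}
```

For the 720x1280 session this yields 88,473,600 bps and 16,588,800 bytes per second, a very generous budget; real projects typically configure a much lower bit rate for transmission.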
3. The encoding-completion callback function
```objectivec
// Encoding-completion callback
void didCompressH264(void *outputCallbackRefCon, void *sourceFrameRefCon, OSStatus status, VTEncodeInfoFlags infoFlags, CMSampleBufferRef sampleBuffer) {
    // 1. Check that the status reports no error
    if (status != noErr) {
        return;
    }
    if (!CMSampleBufferDataIsReady(sampleBuffer)) {
        NSLog(@"didCompressH264 data is not ready");
        return;
    }
    // 2. Recover the encoder object from the ref-con parameter
    VideoH264EnCode *encoder = (__bridge VideoH264EnCode *)outputCallbackRefCon;
    // 3. Determine whether the current frame is a keyframe
    bool isKeyframe = !CFDictionaryContainsKey(
        (CFDictionaryRef)CFArrayGetValueAtIndex(CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, true), 0),
        kCMSampleAttachmentKey_NotSync);
    // Get the SPS & PPS data (only present on keyframes)
    if (isKeyframe) {
        // The encoding parameters are stored in the CMFormatDescriptionRef
        CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer);
        // Get the SPS
        size_t sparameterSetSize, sparameterSetCount;
        const uint8_t *sparameterSet;
        CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 0, &sparameterSet, &sparameterSetSize, &sparameterSetCount, 0);
        // Get the PPS
        size_t pparameterSetSize, pparameterSetCount;
        const uint8_t *pparameterSet;
        CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 1, &pparameterSet, &pparameterSetSize, &pparameterSetCount, 0);
        // Wrap the SPS / PPS in NSData
        NSData *sps = [NSData dataWithBytes:sparameterSet length:sparameterSetSize];
        NSData *pps = [NSData dataWithBytes:pparameterSet length:pparameterSetSize];
        // Hand them to the encoder object (e.g. to write to a file)
        [encoder gotSpsPps:sps pps:pps];
    }
    // Get the data block
    CMBlockBufferRef dataBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    size_t length, totalLength;
    char *dataPointer;
    OSStatus statusCodeRet = CMBlockBufferGetDataPointer(dataBuffer, 0, &length, &totalLength, &dataPointer);
    if (statusCodeRet == noErr) {
        size_t bufferOffset = 0;
        static const int AVCCHeaderLength = 4;
        // The first four bytes of each returned NALU are not the 0x00000001 start code,
        // but the frame length in big-endian byte order (AVCC format).
        // Loop over the NALUs in the buffer.
        while (bufferOffset < totalLength - AVCCHeaderLength) {
            uint32_t NALUnitLength = 0;
            // Read the NALU length and convert it from big-endian to host byte order
            memcpy(&NALUnitLength, dataPointer + bufferOffset, AVCCHeaderLength);
            NALUnitLength = CFSwapInt32BigToHost(NALUnitLength);
            NSData *data = [[NSData alloc] initWithBytes:(dataPointer + bufferOffset + AVCCHeaderLength) length:NALUnitLength];
            [encoder gotEncodedData:data isKeyFrame:isKeyframe];
            // Move to the next NALU
            bufferOffset += AVCCHeaderLength + NALUnitLength;
        }
    }
}
```

4. Get the SPS/PPS and the I, P, B frame data, and return them through a block callback
```objectivec
// Prepend the start code to the SPS and PPS and pass them on
- (void)gotSpsPps:(NSData *)sps pps:(NSData *)pps {
    // The NALU start code, by default 0x00000001
    const char bytes[] = "\x00\x00\x00\x01";
    size_t length = (sizeof bytes) - 1; // string literals have an implicit trailing '\0'
    NSData *byteHeader = [NSData dataWithBytes:bytes length:length];
    NSMutableData *h264Data = [[NSMutableData alloc] init];
    [h264Data appendData:byteHeader];
    [h264Data appendData:sps];
    if (self.h264DataBlock) {
        self.h264DataBlock(h264Data);
    }
    [h264Data resetBytesInRange:NSMakeRange(0, [h264Data length])];
    [h264Data setLength:0];
    [h264Data appendData:byteHeader];
    [h264Data appendData:pps];
    if (self.h264DataBlock) {
        self.h264DataBlock(h264Data);
    }
}

- (void)gotEncodedData:(NSData *)data isKeyFrame:(BOOL)isKeyFrame {
    const char bytes[] = "\x00\x00\x00\x01";
    size_t length = (sizeof bytes) - 1; // string literals have an implicit trailing '\0'
    NSData *byteHeader = [NSData dataWithBytes:bytes length:length];
    NSMutableData *h264Data = [[NSMutableData alloc] init];
    [h264Data appendData:byteHeader];
    [h264Data appendData:data];
    if (self.h264DataBlock) {
        self.h264DataBlock(h264Data);
    }
}
```
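Taken together, the encoding callback and the splicing methods above are just repackaging AVCC NALUs (4-byte big-endian length prefix) as Annex B NALUs (4-byte start code). A minimal C sketch of that conversion, independent of VideoToolbox (the function name is mine):

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Rewrite an AVCC buffer (length-prefixed NALUs) as Annex B (start-code
   prefixed). Returns the number of bytes written to `out`, which must be
   at least as large as `total`. */
size_t avcc_to_annexb(const uint8_t *in, size_t total, uint8_t *out) {
    static const uint8_t startCode[4] = { 0x00, 0x00, 0x00, 0x01 };
    size_t inOff = 0, outOff = 0;
    while (inOff + 4 <= total) {
        /* 4-byte NALU length, big-endian, as in didCompressH264 above */
        uint32_t len = ((uint32_t)in[inOff]     << 24) | ((uint32_t)in[inOff + 1] << 16) |
                       ((uint32_t)in[inOff + 2] << 8)  |  (uint32_t)in[inOff + 3];
        if (inOff + 4 + len > total) break;            /* truncated NALU: stop */
        memcpy(out + outOff, startCode, 4);            /* start code replaces the length */
        memcpy(out + outOff + 4, in + inOff + 4, len); /* NALU payload is copied unchanged */
        inOff  += 4 + len;
        outOff += 4 + len;
    }
    return outOff;
}
```

For example, the AVCC input `00 00 00 02 AA BB` becomes the Annex B output `00 00 00 01 AA BB`: same payload, different 4-byte prefix.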
5. Pass in the raw frame data, invoke encoding, and receive the callback
```objectivec
// Encode a sampleBuffer (raw frame data captured by the camera) to H.264
- (void)encodeSampleBuffer:(CMSampleBufferRef)sampleBuffer H264DataBlock:(void (^)(NSData * _Nonnull))h264DataBlock {
    if (!self.compressionSession) {
        return;
    }
    // 1. Save the block
    self.h264DataBlock = h264DataBlock;
    // 2. Convert the sampleBuffer to an imageBuffer
    CVImageBufferRef imageBuffer = (CVImageBufferRef)CMSampleBufferGetImageBuffer(sampleBuffer);
    // 3. Create a CMTime based on the current frame count
    CMTime presentationTimeStamp = CMTimeMake(self.frameID++, 1000);
    VTEncodeInfoFlags flags;
    // 4. Encode the frame
    OSStatus statusCode = VTCompressionSessionEncodeFrame(
        self.compressionSession,
        imageBuffer,
        presentationTimeStamp,
        kCMTimeInvalid,
        NULL,
        (__bridge void * _Nullable)(self),
        &flags);
    if (statusCode != noErr) {
        NSLog(@"H264: VTCompressionSessionEncodeFrame failed with %d", (int)statusCode);
        VTCompressionSessionInvalidate(self.compressionSession);
        CFRelease(self.compressionSession);
        self.compressionSession = NULL;
        return;
    }
}
```
With the methods above, H.264 encoding of the raw frame data is complete, and the encoded data is returned to the caller through the callback. How it is then packaged and used depends on the needs of your project.
Decoding with VideoToolbox:
Decoding is simply the reverse of encoding: after receiving each frame of H.264 data, perform the decoding operation and finally obtain the raw frame data for display.
1. Initialize the decoder
```objectivec
- (BOOL)initH264Decoder {
    if (_deocderSession) {
        return YES;
    }
    const uint8_t * const parameterSetPointers[2] = { _sps, _pps };
    const size_t parameterSetSizes[2] = { _spsSize, _ppsSize };
    OSStatus status = CMVideoFormatDescriptionCreateFromH264ParameterSets(kCFAllocatorDefault,
                                                                          2, // parameter set count
                                                                          parameterSetPointers,
                                                                          parameterSetSizes,
                                                                          4, // NAL start code size
                                                                          &_decoderFormatDescription);
    if (status == noErr) {
        NSDictionary *destinationPixelBufferAttributes = @{
            // Hardware decoding must use kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange
            // or kCVPixelFormatType_420YpCbCr8Planar
            (id)kCVPixelBufferPixelFormatTypeKey : [NSNumber numberWithInt:kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange],
            (id)kCVPixelBufferOpenGLCompatibilityKey : [NSNumber numberWithBool:YES]
        };
        VTDecompressionOutputCallbackRecord callBackRecord;
        callBackRecord.decompressionOutputCallback = didDecompress;
        callBackRecord.decompressionOutputRefCon = (__bridge void *)self;
        status = VTDecompressionSessionCreate(kCFAllocatorDefault,
                                              _decoderFormatDescription,
                                              NULL,
                                              (__bridge CFDictionaryRef)destinationPixelBufferAttributes,
                                              &callBackRecord,
                                              &_deocderSession);
        VTSessionSetProperty(_deocderSession, kVTDecompressionPropertyKey_ThreadCount, (__bridge CFTypeRef)[NSNumber numberWithInt:1]);
        VTSessionSetProperty(_deocderSession, kVTDecompressionPropertyKey_RealTime, kCFBooleanTrue);
    } else {
        NSLog(@"IOS8VT: create decoder session failed, status=%d", (int)status);
        return NO;
    }
    return YES;
}
```
2. The decoding operation
```objectivec
// The decoding operation, called externally
- (void)decodeNalu:(uint8_t *)frame size:(uint32_t)frameSize {
    int nalu_type = (frame[4] & 0x1F);
    CVPixelBufferRef pixelBuffer = NULL;
    // Overwrite the 4-byte start code with the NALU length in big-endian byte order
    uint32_t nalSize = (uint32_t)(frameSize - 4);
    uint8_t *pNalSize = (uint8_t *)(&nalSize);
    frame[0] = *(pNalSize + 3);
    frame[1] = *(pNalSize + 2);
    frame[2] = *(pNalSize + 1);
    frame[3] = *(pNalSize);
    // When transmitting, keyframes must not be lost (otherwise the picture turns into
    // a green screen); B/P frames may be dropped.
    switch (nalu_type) {
        case 0x05: // keyframe (IDR)
            if ([self initH264Decoder]) {
                pixelBuffer = [self decode:frame withSize:frameSize];
            }
            break;
        case 0x07: // SPS
            _spsSize = frameSize - 4;
            _sps = malloc(_spsSize);
            memcpy(_sps, &frame[4], _spsSize);
            break;
        case 0x08: // PPS
            _ppsSize = frameSize - 4;
            _pps = malloc(_ppsSize);
            memcpy(_pps, &frame[4], _ppsSize);
            break;
        default: // B/P and other frames
            if ([self initH264Decoder]) {
                pixelBuffer = [self decode:frame withSize:frameSize];
            }
            break;
    }
}

- (CVPixelBufferRef)decode:(uint8_t *)frame withSize:(uint32_t)frameSize {
    CVPixelBufferRef outputPixelBuffer = NULL;
    CMBlockBufferRef blockBuffer = NULL;
    OSStatus status = CMBlockBufferCreateWithMemoryBlock(NULL,
                                                         (void *)frame,
                                                         frameSize,
                                                         kCFAllocatorNull,
                                                         NULL,
                                                         0,
                                                         frameSize,
                                                         FALSE,
                                                         &blockBuffer);
    if (status == kCMBlockBufferNoErr) {
        CMSampleBufferRef sampleBuffer = NULL;
        const size_t sampleSizeArray[] = { frameSize };
        status = CMSampleBufferCreateReady(kCFAllocatorDefault,
                                           blockBuffer,
                                           _decoderFormatDescription,
                                           1, 0, NULL, 1, sampleSizeArray,
                                           &sampleBuffer);
        if (status == kCMBlockBufferNoErr && sampleBuffer) {
            VTDecodeFrameFlags flags = 0;
            VTDecodeInfoFlags flagOut = 0;
            OSStatus decodeStatus = VTDecompressionSessionDecodeFrame(_deocderSession,
                                                                      sampleBuffer,
                                                                      flags,
                                                                      &outputPixelBuffer,
                                                                      &flagOut);
            if (decodeStatus == kVTInvalidSessionErr) {
                NSLog(@"IOS8VT: Invalid session, reset decoder session");
            } else if (decodeStatus == kVTVideoDecoderBadDataErr) {
                NSLog(@"IOS8VT: decode failed status=%d (bad data)", (int)decodeStatus);
            } else if (decodeStatus != noErr) {
                NSLog(@"IOS8VT: decode failed status=%d", (int)decodeStatus);
            }
            CFRelease(sampleBuffer);
        }
        CFRelease(blockBuffer);
    }
    return outputPixelBuffer;
}
```
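The first steps of decodeNalu: can be checked in isolation: the NALU type is the low five bits of the byte right after the 4-byte start code (5 = IDR keyframe, 7 = SPS, 8 = PPS), and the start code itself is overwritten with the big-endian NALU length that the decoder expects. A small C sketch (function names and frame bytes are mine, for illustration):

```c
#include <stdint.h>

/* NALU type: low 5 bits of the byte after the 4-byte start code.
   5 = IDR (keyframe), 7 = SPS, 8 = PPS. */
int nalu_type(const uint8_t *frame) {
    return frame[4] & 0x1F;
}

/* Overwrite the 4-byte Annex B start code in place with the big-endian
   NALU length, as decodeNalu: does before handing the frame to the decoder. */
void annexb_to_avcc_inplace(uint8_t *frame, uint32_t frameSize) {
    uint32_t nalSize = frameSize - 4;
    frame[0] = (uint8_t)(nalSize >> 24);
    frame[1] = (uint8_t)(nalSize >> 16);
    frame[2] = (uint8_t)(nalSize >> 8);
    frame[3] = (uint8_t)(nalSize);
}
```

For example, a 7-byte frame `00 00 00 01 65 11 22` has NALU type 5 (keyframe), and after conversion its first four bytes hold the payload length 3; the payload itself is untouched.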
3. The decode-completion callback
```objectivec
// Decode callback function
static void didDecompress(void *decompressionOutputRefCon,
                          void *sourceFrameRefCon,
                          OSStatus status,
                          VTDecodeInfoFlags infoFlags,
                          CVImageBufferRef pixelBuffer,
                          CMTime presentationTimeStamp,
                          CMTime presentationDuration) {
    CVPixelBufferRef *outputPixelBuffer = (CVPixelBufferRef *)sourceFrameRefCon;
    *outputPixelBuffer = CVPixelBufferRetain(pixelBuffer);
    VideoH264Decoder *decoder = (__bridge VideoH264Decoder *)decompressionOutputRefCon;
    if ([decoder.delegate respondsToSelector:@selector(decoder:didDecodingFrame:)]) {
        [decoder.delegate decoder:decoder didDecodingFrame:pixelBuffer];
    }
}
```
4. Displaying the frame data with OpenGL
The display code is not included here; you can check it in the demo.
This article mainly uses VideoToolbox to encode and decode the video stream captured by the iPhone camera and display it. It only covers the basic codec functionality; how you use it depends on your own project. If you want to stream the video over the network, you can refer to "Introduction and Use of Socket & CocoaAsyncSocket" and look into how to handle sticky packets and related issues.
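As a sketch of the sticky-packet problem mentioned above (the framing scheme and function name are my own, not part of CocoaAsyncSocket): TCP is a byte stream, so consecutive H.264 chunks can arrive merged together or split apart. A common fix is to prefix each chunk with its length and only consume a chunk once all of its bytes have arrived:

```c
#include <stdint.h>
#include <stddef.h>

/* Try to extract one length-prefixed message from a receive buffer.
   Returns the payload length and sets *payload, or -1 if the buffer does
   not yet hold a complete message (the caller should wait for more bytes). */
long next_message(const uint8_t *buf, size_t avail, const uint8_t **payload) {
    if (avail < 4) return -1; /* length header not complete yet */
    uint32_t len = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                   ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
    if (avail < 4 + (size_t)len) return -1; /* payload not complete yet */
    *payload = buf + 4;
    return (long)len;
}
```

On each socket read, append the new bytes to the buffer, call this in a loop until it returns -1, and discard the consumed bytes; merged and split chunks then both resolve cleanly.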
For detailed usage, refer to the official VideoToolbox documentation.