TensorFlow queues, threads, and input

Tensorflow Queue Thread

This post is reposted in its entirety.

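I. Queues and threads

1. Queues:

1) tf.FIFOQueue(capacity, dtypes, name='fifo_queue') creates a queue that dequeues elements in first-in, first-out order.

Parameters:

capacity: an integer; the maximum number of elements the queue can hold.

dtypes: a list of DType objects. The length of dtypes must equal the number of tensors in each queue element.

Methods:

Q.dequeue() takes one element out of the queue.

Q.enqueue(value) adds one element to the queue.

Q.enqueue_many(list or tuple) adds several elements to the queue at once.

Q.size() returns the number of elements currently in the queue.

2) tf.RandomShuffleQueue() creates a queue that dequeues elements in random order.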
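As a rough analogy only (this is the Python standard library, not TensorFlow), the queue operations listed above behave much like a bounded queue.Queue. The sketch below mirrors enqueue, enqueue_many, dequeue, and size; the helper name enqueue_many is made up for illustration.

```python
import queue

# Bounded FIFO queue: maxsize plays the role of tf.FIFOQueue's `capacity`.
q = queue.Queue(maxsize=3)


def enqueue_many(q, values):
    """Hypothetical helper: add several elements one after another."""
    for v in values:
        q.put(v)  # blocks if the queue is already full


enqueue_many(q, [0.1, 0.2])   # like Q.enqueue_many([...])
q.put(0.3)                    # like Q.enqueue(value)
size_when_full = q.qsize()    # like Q.size()

# Elements come out in the order they went in (first in, first out).
dequeued = [q.get() for _ in range(3)]   # like Q.dequeue()
print(size_when_full, dequeued)
```

Unlike this sketch, the TensorFlow queue methods only build graph operations; nothing is enqueued or dequeued until the operation is run in a session.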

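2. Queue runner

tf.train.QueueRunner(queue, enqueue_ops=None)

Parameters:

queue: the queue to fill.

enqueue_ops: a list of enqueue operations, each run in its own thread; for example, [op] * 2 specifies two threads.

create_threads(sess, coord=None, start=False) creates the threads that run the enqueue operations in the given session.

start: a Boolean. If True, the threads are started immediately; if False, the caller must call start() to start them.

coord: a thread coordinator used to manage the threads.

3. Thread coordinator

tf.train.Coordinator() creates a thread coordinator, which implements a simple mechanism for coordinating the termination of a group of threads.

Returns: a thread coordinator instance

Methods:

request_stop() requests that the threads stop.

join(threads=None, stop_grace_period_secs=120) waits for the threads to terminate.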
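To make the stop-and-join mechanism concrete, here is a minimal sketch in plain Python threads. MiniCoordinator and producer are illustrative stand-ins written for this post, not the TensorFlow classes, and queue.Queue stands in for the TensorFlow queue.

```python
import queue
import threading


class MiniCoordinator:
    """Illustrative stand-in for tf.train.Coordinator (not the real class)."""

    def __init__(self):
        self._stop_event = threading.Event()

    def request_stop(self):
        self._stop_event.set()

    def should_stop(self):
        return self._stop_event.is_set()

    def join(self, threads, timeout_secs=5):
        for t in threads:
            t.join(timeout_secs)


def producer(coord, q, counter, lock):
    """Enqueue increasing numbers until a stop is requested."""
    while not coord.should_stop():
        with lock:
            counter[0] += 1
            value = counter[0]
        try:
            # Short timeout so the stop flag is rechecked while the queue is full.
            q.put(value, timeout=0.05)
        except queue.Full:
            continue


q = queue.Queue(maxsize=20)
coord = MiniCoordinator()
counter, lock = [0], threading.Lock()

# Two producer threads, comparable to passing two enqueue ops to a queue runner.
threads = [threading.Thread(target=producer, args=(coord, q, counter, lock))
           for _ in range(2)]
for t in threads:
    t.start()

# The main thread consumes while the producers keep filling the queue.
results = [q.get() for _ in range(50)]

coord.request_stop()
coord.join(threads)
all_stopped = not any(t.is_alive() for t in threads)
print(len(results), all_stopped)
```

The point of the design is that worker threads never get killed from outside: they poll a shared flag and exit on their own, and the main thread only waits for them.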

A small example that combines a queue, a queue runner, and a thread coordinator to fill a queue asynchronously:

# Uses the TensorFlow 1.x graph/session API
import tensorflow as tf

# 1. Create a FIFO queue that holds up to 2000 float32 elements
Q = tf.FIFOQueue(2000, tf.float32)

# 2. Build the ops that add data to the queue
# 2.1 Create a variable to act as the data source
var = tf.Variable(0.0, dtype=tf.float32)
# 2.2 Increment the variable by 1
plus = tf.assign_add(var, 1)
# 2.3 Enqueue the incremented value
en_q = Q.enqueue(plus)
# 2.4 Build the dequeue op once, rather than once per loop iteration
de_q = Q.dequeue()

# 3. Create a queue runner with two enqueue threads
qr = tf.train.QueueRunner(Q, enqueue_ops=[en_q] * 2)

# 4. Variable initialization op
init = tf.global_variables_initializer()

# 5. Create a session
with tf.Session() as sess:
    # 6. Run the initialization
    sess.run(init)
    # 7. Create a thread coordinator
    coord = tf.train.Coordinator()
    # 8. Start the child threads
    threads = qr.create_threads(sess, coord=coord, start=True)
    # 9. The main thread fetches data from the queue
    for i in range(200):
        print(sess.run(de_q))
    # 10. Stop the threads and wait for them to finish
    coord.request_stop()
    coord.join(threads)

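II. Reading files

1. File-reading workflow

2. File-reading API

1) File queue

tf.train.string_input_producer(string_tensor, ..., shuffle=True) feeds the given strings (for example, file names) into a queue for the input pipeline.

Parameters:

string_tensor: a 1-D tensor of file names.

num_epochs: the number of passes over the data; by default the data is cycled indefinitely.

Returns: a queue containing the output strings

2) File readers (choose the reader that matches the file format)

CSV files: class tf.TextLineReader. Reads one line at a time by default. Returns: a reader instance

Binary files: tf.FixedLengthRecordReader(record_bytes). record_bytes: an integer specifying the number of bytes read each time. Returns: a reader instance

TFRecords files: tf.TFRecordReader. Returns: a reader instance

All three readers share the same method:

read(file_queue): reads one record from the files in the queue and returns a tuple of tensors (key, value), where key identifies the record by file name and value is the content read (one line, or a fixed number of bytes).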
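The idea behind reading fixed-length records can be sketched without TensorFlow. The generator below is a plain-Python illustration; the function name, the key format, and the demo file are assumptions made for this sketch and are not taken from the TensorFlow implementation.

```python
import os
import tempfile


def read_fixed_length_records(path, record_bytes):
    """Yield (key, value) pairs, one per complete fixed-size record."""
    with open(path, "rb") as f:
        index = 0
        while True:
            value = f.read(record_bytes)
            if len(value) < record_bytes:
                break  # end of file (a trailing partial record is ignored)
            # The key format is an assumption made for this sketch.
            yield "%s:%d" % (path, index), value
            index += 1


# Write a small binary file of three 4-byte records to read back.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "demo.bin")
with open(path, "wb") as f:
    f.write(bytes(range(12)))

records = list(read_fixed_length_records(path, record_bytes=4))
values = [value for _, value in records]
print(values)
```

Each value is still a raw byte string at this point, which is why a separate decoding step is needed afterwards.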

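3) File content decoders (the readers return strings, so a function is needed to parse those strings into tensors)

① tf.decode_csv(records, record_defaults=None, field_delim=None, name=None)

Parameters:

records: a string tensor; each string is one record line of the CSV file.

field_delim: the field delimiter; defaults to ','.

record_defaults: determines the types of the resulting tensors and supplies the default value used when a field is missing from the input string.

② tf.decode_raw(bytes, out_type, little_endian=None, name=None) converts bytes into a numeric vector. bytes is a string tensor. It is used together with tf.FixedLengthRecordReader, with the binary data read as uint8.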
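The role of record_defaults in tf.decode_csv can be illustrated in plain Python. This is only a sketch of the idea: decode_csv_line is a hypothetical helper, and it takes a flat list of defaults rather than the list of one-element lists that the TensorFlow function expects.

```python
import csv


def decode_csv_line(line, record_defaults, field_delim=","):
    """Parse one CSV line; empty fields fall back to the column default.

    Each default also fixes the column's Python type, loosely mirroring
    how record_defaults fixes the tensor type in tf.decode_csv.
    """
    fields = next(csv.reader([line], delimiter=field_delim))
    if len(fields) != len(record_defaults):
        raise ValueError("expected %d fields, got %d"
                         % (len(record_defaults), len(fields)))
    parsed = []
    for text, default in zip(fields, record_defaults):
        column_type = type(default)
        parsed.append(default if text == "" else column_type(text))
    return parsed


defaults = ["None", 0, 0.0]            # string, int, and float columns
row_full = decode_csv_line("alpha,3,1.5", defaults)
row_missing = decode_csv_line("beta,,", defaults)
print(row_full, row_missing)
```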

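4) Starting the threads

tf.train.start_queue_runners(sess=None, coord=None) collects all the queue runners in the graph and starts their threads. sess: the session to run in. coord: the thread coordinator. Returns: a list of all the threads

5) Batching at the end of the pipeline

① tf.train.batch(tensors, batch_size, num_threads=1, capacity=32, name=None) reads batches of tensors of the specified size.

Parameters:

tensors: may be a list of tensors.

batch_size: the size of each batch read from the queue.

num_threads: the number of threads enqueuing.

capacity: an integer; the maximum number of elements in the queue.

Returns: tensors

② tf.train.shuffle_batch(tensors, batch_size, capacity, min_after_dequeue, num_threads=1, ...) reads batches of tensors of the specified size in shuffled order.

Parameters:

min_after_dequeue: the minimum number of elements left in the queue after a dequeue, which keeps enough elements around for the random shuffle to mix well.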
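The effect of min_after_dequeue is easier to see in a small simulation. The generator below is a plain-Python sketch of a shuffle buffer, not the TensorFlow implementation; shuffle_batches and its seed argument are hypothetical, and unlike the queue-based pipeline it stops when the input ends and emits a final partial batch.

```python
import random


def shuffle_batches(items, batch_size, min_after_dequeue, seed=0):
    """Yield batches drawn at random from a buffer of pending elements.

    While input remains, an element is only dequeued when the buffer holds
    more than min_after_dequeue elements, so at least that many stay behind.
    """
    rng = random.Random(seed)
    buffer, batch = [], []
    source = iter(items)
    exhausted = False
    while True:
        # Refill until one element can be removed without dropping below the minimum.
        while not exhausted and len(buffer) <= min_after_dequeue:
            try:
                buffer.append(next(source))
            except StopIteration:
                exhausted = True
        if not buffer:
            break
        # Dequeue a uniformly random element from the buffer.
        batch.append(buffer.pop(rng.randrange(len(buffer))))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch


batches = list(shuffle_batches(range(20), batch_size=5, min_after_dequeue=8))
flat = [x for b in batches for x in b]
print(batches)
```

A larger min_after_dequeue means each draw is taken from a bigger pool, so the output order is closer to a full shuffle, at the cost of more memory and a longer start-up fill.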

3. File-reading example

import os

import tensorflow as tf


def csv_read(filelist):
    # Build the file name queue
    Q = tf.train.string_input_producer(filelist)
    # Build the reader
    reader = tf.TextLineReader()
    # Read one line from the queue
    key, value = reader.read(Q)
    # Decode the line; both columns are strings that default to 'None'
    x1, y = tf.decode_csv(value, record_defaults=[['None'], ['None']])
    # Batch the decoded records
    x1_batch, y_batch = tf.train.batch([x1, y], batch_size=12, num_threads=1, capacity=12)
    # Open a session
    with tf.Session() as sess:
        # Create the thread coordinator
        coord = tf.train.Coordinator()
        # Start the queue runner threads
        threads = tf.train.start_queue_runners(sess, coord=coord)
        # Fetch one batch
        print(sess.run([x1_batch, y_batch]))
        # Stop the threads and wait for them to finish
        coord.request_stop()
        coord.join(threads)


if __name__ == '__main__':
    # Data directory; set this to your own path
    filename = os.listdir('./data/')
    filelist = [os.path.join('./data/', file) for file in filename]
    csv_read(filelist)

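III. Reading and storing images

1. The three elements of a digitized image: height, width, and number of channels (1 channel: grayscale; 3 channels: RGB)

2. Resizing images

tf.image.resize_images(images, size) resizes the images to the given size.

Purpose:

1. Make the image data uniform

2. Convert all images to the specified size

3. Reduce the amount of image data to keep overhead down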
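A quick calculation shows why the size matters. The sketch below simply multiplies height, width, channels, and bytes per value; image_bytes is a hypothetical helper, and the 1024x768 original size is an arbitrary figure chosen only for comparison.

```python
def image_bytes(height, width, channels, bytes_per_value):
    """Memory needed for one decoded image stored as a dense array."""
    return height * width * channels * bytes_per_value


# A 256x256 RGB image stored as uint8 (1 byte per value).
one_uint8 = image_bytes(256, 256, 3, 1)
# The same image stored as float32 (4 bytes per value).
one_float32 = image_bytes(256, 256, 3, 4)
# A batch of 100 such float32 images.
batch_float32 = 100 * one_float32

# Hypothetical original size, chosen only for comparison.
original_uint8 = image_bytes(1024, 768, 3, 1)

print(one_uint8, one_float32, batch_float32, original_uint8)
```

Under these assumptions a batch of 100 resized images takes about 79 MB as float32, while a single 1024x768 image is already 12 times larger than its 256x256 version.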

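3. Image-reading API

1) Image reader

tf.WholeFileReader: a reader that outputs the entire contents of a file as the value.

Returns: a reader instance. read(file_queue): outputs the file name (key) and the file contents (value).

2) Image decoders

tf.image.decode_jpeg(contents) decodes a JPEG-encoded image into a uint8 tensor.

Returns: a uint8 tensor with 3-D shape [height, width, channels]

tf.image.decode_png(contents) decodes a PNG-encoded image into a uint8 or uint16 tensor.

Returns: a tensor with 3-D shape [height, width, channels]

A simple demo of reading images: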

import os

import tensorflow as tf

flags = tf.app.flags.FLAGS
# Image directory; set this to your own path
tf.app.flags.DEFINE_string('data_home', './data/dog/', 'dog picture directory')


def picread(filelist):
    # Build the file name queue
    file_q = tf.train.string_input_producer(filelist)
    # Build the reader
    reader = tf.WholeFileReader()
    # Read the contents of one file
    key, value = reader.read(file_q)
    print(value)
    # Decode the JPEG bytes
    image = tf.image.decode_jpeg(value)
    print(image)
    # Unify the picture size: set the height and width
    resize_image = tf.image.resize_images(image, [256, 256])
    print(resize_image)
    # Fix the static shape, including the channel count, so batching works
    resize_image.set_shape([256, 256, 3])
    # Build the batching pipeline
    image_batch = tf.train.batch([resize_image], batch_size=100, num_threads=1, capacity=100)
    return image_batch


if __name__ == '__main__':
    filename = os.listdir(flags.data_home)
    filelist = [os.path.join(flags.data_home, file) for file in filename]
    image_batch = picread(filelist)
    with tf.Session() as sess:
        # Create the thread coordinator
        coord = tf.train.Coordinator()
        # Start the queue runner threads
        threads = tf.train.start_queue_runners(sess, coord=coord)
        # Fetch one batch of images
        print(sess.run(image_batch))
        # Stop the threads and wait for them to finish
        coord.request_stop()
        coord.join(threads)

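IV. TFRecords: analysis and access

1. Concept

TFRecords is a built-in file format designed by TensorFlow. It is a binary format that makes better use of memory and is easy to copy and move (the binary data and the labels, i.e. the class labels used for training, are stored in the same file).

2. TFRecords file analysis

1) File format: *.tfrecords

2) Contents written to the file: Example protocol blocks

3. Storing TFRecords

1) Create a TFRecord writer

tf.python_io.TFRecordWriter(path) writes to a tfrecords file.

Parameters:

path: the path of the TFRecords file.

Returns: a writer that performs the file writes

Methods:

write(record): writes a string record to the file. The record is a serialized Example, obtained with example.SerializeToString().

close(): closes the file writer.

2) Build an Example protocol block for each sample

tf.train.Example(features=None) builds the Example protocol block that is written to the tfrecords file.

Parameters:

features: a tf.train.Features instance.

Returns: an Example protocol block

tf.train.Features(feature=None) builds the key-value pairs that hold each sample's data.

Parameters:

feature: a dictionary; each key is the name to store the data under,

and each value is a tf.train.Feature instance.

Returns: a Features instance

tf.train.Feature(**options)

Parameters:

**options: for example, bytes_list=tf.train.BytesList(value=[Bytes])

int64_list=tf.train.Int64List(value=[Value])

float_list=tf.train.FloatList(value=[value])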

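4. Reading TFRecords

1) Build the file queue

tf.train.string_input_producer(string_tensor, ..., shuffle=True)

2) Build the file reader and read data from the queue

tf.TFRecordReader. Returns: a reader instance

read(file_queue)

3) Parse the Example protocol block read from the TFRecords file

① tf.parse_single_example(serialized, features=None, name=None) parses a single Example proto.

Parameters:

serialized: a scalar string tensor holding one serialized Example.

features: a dictionary; each key is the name to read and each value is a FixedLenFeature.

Returns: a dictionary of key-value pairs, keyed by the names that were read

② tf.FixedLenFeature(shape, dtype)

Parameters:

shape: the shape of the input data; usually left unspecified, i.e. an empty list.

dtype: the type of the input data, which must match the type stored in the file. Only float32, int64, and string are allowed.

4) Decode

tf.decode_raw(bytes, out_type, little_endian=None, name=None) converts bytes into a numeric vector. bytes is a string tensor. It is used together with tf.FixedLengthRecordReader, with the binary data read as uint8.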
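What decoding raw bytes as uint8 and then slicing amounts to can be shown in plain Python. The sketch below uses the record layout from the CIFAR-10 example in this section (1 label byte followed by 32 * 32 * 3 image bytes); split_record and the synthetic record are illustrative only.

```python
# Layout taken from the CIFAR-10 example in this section:
# 1 label byte, then 32 * 32 * 3 image bytes per record.
HEIGHT, WIDTH, CHANNELS = 32, 32, 3
LABEL_BYTES = 1
IMAGE_BYTES = HEIGHT * WIDTH * CHANNELS
RECORD_BYTES = LABEL_BYTES + IMAGE_BYTES


def split_record(record):
    """Split one raw record into (label, list of uint8 pixel values)."""
    if len(record) != RECORD_BYTES:
        raise ValueError("unexpected record length: %d" % len(record))
    values = list(record)            # each byte becomes an int in 0..255
    label = values[0]
    pixels = values[LABEL_BYTES:]
    return label, pixels


# Synthetic record for illustration: label 7 followed by a repeating byte pattern.
fake_record = bytes([7]) + bytes(i % 256 for i in range(IMAGE_BYTES))
label, pixels = split_record(fake_record)
print(RECORD_BYTES, label, len(pixels), pixels[:4])
```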

Below is a small example that reads data from binary files, writes it to a tfrecords file, and then reads it back from the tfrecords file.

import os

import tensorflow as tf

flags = tf.app.flags.FLAGS
tf.app.flags.DEFINE_string('data_home', './data/cifar10/cifar-10-batches-bin/', 'binary file directory')
tf.app.flags.DEFINE_string('data_tfrecords', './data/temp/tfrecords', 'tfrecords file path')


class cifarread(object):

    def __init__(self, filelist):
        self.filelist = filelist
        # Sizes that describe one record
        self.height = 32
        self.width = 32
        self.channel = 3
        self.label_bytes = 1
        self.image_bytes = self.height * self.width * self.channel
        self.bytes = self.label_bytes + self.image_bytes

    def read_decode(self):
        '''
        Read the binary files
        :return: image_batch, label_batch
        '''
        # Build the file name queue
        file_q = tf.train.string_input_producer(self.filelist)
        # Build the reader
        reader = tf.FixedLengthRecordReader(record_bytes=self.bytes)
        # Read one record
        key, value = reader.read(file_q)
        # Decode the bytes as uint8
        label_image = tf.decode_raw(value, tf.uint8)
        # Split the record into label and image
        label = tf.cast(tf.slice(label_image, [0], [self.label_bytes]), tf.int32)
        image = tf.slice(label_image, [self.label_bytes], [self.image_bytes])
        # CIFAR-10 stores each image channel by channel, so reshape to
        # [channel, height, width] and transpose to [height, width, channel]
        image_tensor = tf.transpose(
            tf.reshape(image, [self.channel, self.height, self.width]), [1, 2, 0])
        # Batch the samples
        image_batch, label_batch = tf.train.batch([image_tensor, label], batch_size=10, num_threads=1, capacity=10)
        return image_batch, label_batch

    def write2tfrecords(self, image_batch, label_batch):
        '''
        Write content read from the binary files to a tfrecords file
        :param image_batch:
        :param label_batch:
        :return:
        '''
        # Build the tfrecords writer
        writer = tf.python_io.TFRecordWriter(flags.data_tfrecords)
        # Evaluate both batches in a single run so images and labels stay paired
        # (needs a default session, i.e. a call inside `with tf.Session()`)
        images, labels = tf.get_default_session().run([image_batch, label_batch])
        # An Example must be built for every sample that is written
        for i in range(10):
            # Take the feature values and convert them to a byte string
            image_string = images[i].tobytes()
            # Take the target value
            label_int = int(labels[i][0])
            example = tf.train.Example(features=tf.train.Features(feature={
                'image': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_string])),
                'label': tf.train.Feature(int64_list=tf.train.Int64List(value=[label_int]))
            }))
            # The Example must be serialized before it is written to the file
            writer.write(example.SerializeToString())
        writer.close()
        return None

    def read_tfrecords(self):
        '''
        Read content from the tfrecords file
        :return: image_batch, label_batch
        '''
        # Build the file queue
        file_q = tf.train.string_input_producer([flags.data_tfrecords])
        # Build the reader
        reader = tf.TFRecordReader()
        # Read only one sample at a time
        key, value = reader.read(file_q)
        # Parse the Example protocol block
        feature = tf.parse_single_example(value, features={
            'image': tf.FixedLenFeature([], tf.string),
            'label': tf.FixedLenFeature([], tf.int64)
        })
        # The string feature needs decoding; the integer feature does not
        image = tf.decode_raw(feature['image'], tf.uint8)
        # Set the shape of the picture so it can be batched
        image_reshape = tf.reshape(image, [self.height, self.width, self.channel])
        label = tf.cast(feature['label'], tf.int32)
        # Batch the samples
        image_batch, label_batch = tf.train.batch([image_reshape, label], batch_size=10, num_threads=1, capacity=10)
        return image_batch, label_batch


if __name__ == '__main__':
    filename = os.listdir(flags.data_home)
    filelist = [os.path.join(flags.data_home, file) for file in filename if file[-3:] == 'bin']
    cif = cifarread(filelist)
    # Read the binary files
    image_batch, label_batch = cif.read_decode()
    # Read the tfrecords file
    # cif.read_tfrecords()
    with tf.Session() as sess:
        # Create the thread coordinator
        coord = tf.train.Coordinator()
        # Start the queue runner threads
        threads = tf.train.start_queue_runners(sess, coord=coord)
        # Fetch one batch
        print(sess.run([image_batch, label_batch]))
        # Store the tfrecords file
        # cif.write2tfrecords(image_batch, label_batch)
        # Stop the threads and wait for them to finish
        coord.request_stop()
        coord.join(threads)

