Forums: Open Discussion (Thread #26595)

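Converting the yen sign (U+00A5) (2010-06-19 20:22 by Anonymous #51410)

Converting U+00A5 (the yen sign) to another encoding produces the full-width yen sign (for example, 0xA1EF in EUC). Is there an option to convert it to 0x5C instead?
This is causing me trouble because some applications turn 0x5C into U+00A5 when converting Shift_JIS text to UTF.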

RE: Converting the yen sign (U+00A5) (2010-06-19 21:06 by naruse #51413)

--cp932 performs that conversion.
Note, however, that a few other characters are converted along with it.
Reply to #51410
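
As a rough illustration of the result asked for in #51410, here is a small Python sketch. It is not nkf's implementation and says nothing about which other characters --cp932 affects. The helper name `to_euc_with_ascii_yen` is hypothetical, and the sketch assumes Python's built-in `euc_jp` codec. It rewrites U+00A5 to the single byte 0x5C before encoding, and shows the full-width yen sign (U+FFE5) encoding to 0xA1EF.

```python
YEN_SIGN = "\u00a5"    # half-width yen sign
BACKSLASH = "\u005c"   # occupies byte 0x5C in ASCII


def to_euc_with_ascii_yen(text: str) -> bytes:
    """Encode text as EUC-JP, sending U+00A5 to the single byte 0x5C.

    Hypothetical helper for illustration only; this is not how nkf works.
    """
    # Replace explicitly so the result does not depend on how a given
    # codec happens to treat U+00A5.
    return text.replace(YEN_SIGN, BACKSLASH).encode("euc_jp")


print(to_euc_with_ascii_yen("\u00a5100").hex())  # 5c313030
print("\uffe5".encode("euc_jp").hex())           # a1ef
```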

RE: Converting the yen sign (U+00A5) (2010-06-19 22:36 by Anonymous #51415)

Thank you.
I was thinking of converting with --cp932 and then converting further to EUC with -e, but would --oc=CP51932 also be fine?
Reply to #51413

RE: Converting the yen sign (U+00A5) (2010-06-19 23:18 by naruse #51416)

If you want a Windows-style conversion from Shift JIS to EUC,
--ic=CP932 --oc=CP51932 is the way to go.
Reply to #51415

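RE: Converting the yen sign (U+00A5) (2010-06-20 00:04 by Anonymous #51419)

What I want to do is batch-process (many) text files of unknown encoding and unify them as EUC restricted to the codes defined in ISO-2022-JP.
Based on what you have told me so far,
-I --oc=CP51932
should achieve that.

By the way, when CP932 or CP51932 is not specified, why is U+00A5 converted to the full-width yen sign at kuten code 0179?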
Reply to #51416
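
The batch goal described in #51419 (many files of unknown encoding, unified as EUC) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not a substitute for nkf. The candidate list and its order are a guess-by-trial heuristic of my own, not nkf's detection logic, and the names `decode_guess`, `unify_to_euc`, and `convert_dir` are hypothetical. Python's `euc_jp` codec is also not the same thing as CP51932.

```python
from pathlib import Path

# Candidate encodings, tried in order. The ordering is an assumption made
# for this sketch; it is not how nkf detects encodings and it can misjudge
# short or ambiguous inputs.
CANDIDATES = ("iso2022_jp", "utf-8", "euc_jp", "cp932")


def decode_guess(data: bytes) -> str:
    """Return the text from the first candidate codec that decodes cleanly."""
    for name in CANDIDATES:
        try:
            return data.decode(name)
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings matched")


def unify_to_euc(data: bytes) -> bytes:
    """Re-encode bytes of unknown encoding as EUC-JP.

    Strict encoding is used, so a character the codec cannot represent
    raises UnicodeEncodeError instead of being silently replaced.
    """
    return decode_guess(data).encode("euc_jp")


def convert_dir(src: Path, dst: Path) -> None:
    """Convert every *.txt file directly under src and write it to dst."""
    dst.mkdir(parents=True, exist_ok=True)
    for path in src.glob("*.txt"):
        (dst / path.name).write_bytes(unify_to_euc(path.read_bytes()))
```

Calling `convert_dir(Path("in"), Path("out"))` would process one directory; both directory names are placeholders.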

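RE: Converting the yen sign (U+00A5) (2010-06-20 04:56 by naruse #51425)

If CP932 or similar is not specified, the conversion is JIS-style.
Also, nkf always assumes that the range 0x00-0x7F is ASCII.
As a result, U+00A5 becomes the JIS X 0208 yen sign, that is, the full-width yen sign.

That is the rationale one can give, but put plainly, it was done that way long ago
and has been left as is for compatibility. That is the real reason.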
Reply to #51419
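
For readers wondering how kuten code 0179 from #51419 relates to the EUC bytes 0xA1EF from #51410, the arithmetic can be sketched in Python. The function names are hypothetical and the code only shows the standard JIS X 0208 row/cell arithmetic. It does not reproduce anything from nkf.

```python
def kuten_to_jis(row: int, cell: int) -> bytes:
    """Convert a JIS X 0208 row/cell (kuten) pair to its 7-bit JIS bytes."""
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must each be in 1..94")
    # Each coordinate is offset by 0x20 to land in the printable range.
    return bytes([0x20 + row, 0x20 + cell])


def jis_to_euc(jis: bytes) -> bytes:
    """Set the high bit of each JIS byte to obtain the EUC-JP bytes."""
    return bytes(b | 0x80 for b in jis)


jis = kuten_to_jis(1, 79)    # kuten 0179: row 1, cell 79
euc = jis_to_euc(jis)
print(jis.hex(), euc.hex())  # 216f a1ef
```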

RE: Converting the yen sign (U+00A5) (2010-06-20 23:01 by Anonymous #51444)

Thank you very much. I learned a lot.
Reply to #51425
