Copy text from multiple files, same names to different path in bash (linux)

I need help copying content from various files to others (same name and format, different path).

For example, $HOME/initial/baby.desktop has text which I need to write into $HOME/scripts/baby.desktop. This is very simple for a single file, but I have 2500 files in $HOME/initial/ and the same number in $HOME/scripts/ with corresponding names (same names and format). I want to append the content of each file in path A to the end of the file with the same name in path B, without erasing the existing content of the file in path B.

The content should go from $HOME/initial/*.desktop to the end of the corresponding $HOME/scripts/*.desktop. I tried the following, but it doesn't work:

cd $HOME/initial/

for i in $( ls *.desktop ); do  egrep "Icon" $i  >> $HOME/scripts/$i; done
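
For reference, here is a minimal sketch of the loop described above, written without parsing the output of ls and with every expansion quoted. The function name append_matching and its three parameters are hypothetical, not from the original post, and it assumes the goal is to append only the lines matching a pattern (as the egrep "Icon" attempt suggests) to a file that already exists in the target directory.

```shell
# Hypothetical helper (not from the original post).
# Appends every line matching PATTERN from each *.desktop file in SRC_DIR
# to the end of the file with the same name in DST_DIR.
# The body runs in a subshell, so its variables do not leak to the caller.
append_matching() (
    src_dir=$1
    dst_dir=$2
    pattern=$3
    for src in "$src_dir"/*.desktop; do
        [ -e "$src" ] || continue              # the glob matched nothing
        name=${src##*/}                        # strip the directory part
        [ -f "$dst_dir/$name" ] || continue    # skip files with no counterpart
        # >> appends, so the existing content of the target file is kept.
        # grep exits with 1 when nothing matches; that is not an error here.
        grep -e "$pattern" -- "$src" >> "$dst_dir/$name" || true
    done
)
```

Under those assumptions it could be called as append_matching "$HOME/initial" "$HOME/scripts" "Icon". To append whole files instead of matching lines, the grep line could be swapped for cat -- "$src" >> "$dst_dir/$name".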

Source: https://stackoverflow.com/questions/21790219
Updated: 2024-05-24 20:05

Best answer

Short version

Your data is lost, and there is no general way to recover the original strings.

Longer version

What presumably happened when the data was stored: the strings were encoded as ISO-8859-1, but those bytes were then interpreted as UTF-8. Here's an example:

string orig = "Lernkärtchen";
byte[] iso88591Bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(orig);
// { 76, 101, 114, 110, 107, 228, 114, 116, 99, 104, 101, 110 }
//  'L', 'e', 'r', 'n', 'k', 'ä', 'r', 't', 'c', 'h', 'e', 'n'

When this data was passed (somehow...) to the database, which only works with Unicode strings:

string storedValue = Encoding.UTF8.GetString(iso88591Bytes);
byte[] dbData = Encoding.UTF8.GetBytes(storedValue);
// { 76, 101, 114, 110, 107, 239, 191, 189, 114, 116, 99, 104, 101, 110 }
//  'L', 'e', 'r', 'n', 'k',      '�',     'r', 't', 'c', 'h', 'e', 'n'

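The problem is that byte 228 (binary 11100100) is not valid UTF-8 in this position. A byte of the form 1110xxxx starts a three-byte sequence, so it must be followed by two continuation bytes of the form 10xxxxxx (values 128 to 191), and the next byte here, 114 ('r'), is not one. For details, see the "Description" section of the UTF-8 article on Wikipedia.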
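
To make the rule concrete, here is a small sketch in POSIX shell that classifies a byte by its leading bits. The function name utf8_byte_kind is made up for illustration, and the check is simplified: it looks only at the bit pattern and does not reject lead bytes that UTF-8 forbids for other reasons (such as 192, 193, or values above 244).

```shell
# Hypothetical helper: classify one byte (given as a decimal number)
# by its leading bits, following the UTF-8 bit patterns.
utf8_byte_kind() (
    b=$1
    if   [ $(( (b & 0x80) == 0x00 )) -eq 1 ]; then echo "ascii"         # 0xxxxxxx
    elif [ $(( (b & 0xC0) == 0x80 )) -eq 1 ]; then echo "continuation"  # 10xxxxxx
    elif [ $(( (b & 0xE0) == 0xC0 )) -eq 1 ]; then echo "lead-2"        # 110xxxxx
    elif [ $(( (b & 0xF0) == 0xE0 )) -eq 1 ]; then echo "lead-3"        # 1110xxxx
    elif [ $(( (b & 0xF8) == 0xF0 )) -eq 1 ]; then echo "lead-4"        # 11110xxx
    else echo "invalid"
    fi
)

utf8_byte_kind 228   # prints lead-3: two continuation bytes must follow
utf8_byte_kind 114   # prints ascii: so the sequence 228, 114 is malformed
```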

So what happens is that the byte formerly known as the character 'ä' cannot be decoded into a valid Unicode character and is replaced by the bytes 239, 191 and 189. In binary these are 11101111, 10111111 and 10111101, which decode to the code point 1111111111111101 (0xFFFD), the character '�' you see in your output.
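
The arithmetic behind that can be checked with a few lines of POSIX shell. This is only an illustrative sketch: the function name decode_utf8_3 is hypothetical, and it assumes its three arguments already form a well-formed three-byte sequence (1110xxxx 10xxxxxx 10xxxxxx), without validating them.

```shell
# Hypothetical helper: combine the payload bits of a three-byte UTF-8
# sequence (bytes given as decimal numbers) into a code point.
decode_utf8_3() (
    b1=$1
    b2=$2
    b3=$3
    # 4 payload bits from the lead byte, 6 from each continuation byte
    cp=$(( ((b1 & 0x0F) << 12) | ((b2 & 0x3F) << 6) | (b3 & 0x3F) ))
    printf 'U+%04X\n' "$cp"
)

decode_utf8_3 239 191 189   # prints U+FFFD
```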

This character exists for exactly that purpose. The Wikipedia page on Unicode special characters says:

U+FFFD � replacement character used to replace an unknown or unrepresentable character

Try to revert that change? Good luck.

Btw, Unicode and UTF-8 are awesome ♥, never use anything else ☠!
