Remove Duplicate Lines From Text File?
Question
Given an input file of text lines, I want duplicate lines to be identified and removed. Please show a simple snippet of C# that accomplishes this.
Recommended answer
This should do it (and will cope with large files).
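Note that it only removes consecutive duplicate lines, i.e. a line is dropped only when it is identical to the line immediately before it. A repeat that appears later in the file, after other lines, is kept.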
If you want no duplicates anywhere, you'll need to keep a set of the lines you've already seen.
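A minimal sketch of the consecutive-duplicate approach, assuming the input and output paths are passed as command-line arguments and refer to different files. The names `DeDuper` and `DedupeFile` and the usage message are illustrative choices, not something the text above specifies.

```csharp
using System;
using System.IO;

class DeDuper
{
    // Copies inputPath to outputPath, skipping any line that is identical
    // to the line immediately before it. (Hypothetical helper name.)
    public static void DedupeFile(string inputPath, string outputPath)
    {
        // File.OpenText and File.CreateText both work with UTF-8 text.
        using (TextReader reader = File.OpenText(inputPath))
        using (TextWriter writer = File.CreateText(outputPath))
        {
            string currentLine;
            string lastLine = null;

            // Read one line at a time so the whole file never sits in memory.
            while ((currentLine = reader.ReadLine()) != null)
            {
                // Write the line only if it differs from the previous one.
                if (currentLine != lastLine)
                {
                    writer.WriteLine(currentLine);
                    lastLine = currentLine;
                }
            }
        }
    }

    static int Main(string[] args)
    {
        if (args.Length != 2)
        {
            Console.Error.WriteLine("Usage: DeDuper <input file> <output file>");
            return 1;
        }
        DedupeFile(args[0], args[1]);
        return 0;
    }
}
```

Because only the previous line is remembered, memory use stays constant regardless of file size.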
Note that this assumes Encoding.UTF8, and that you want to use files. It's easy to generalize as a method, though:
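One way such a generalization might look, sketched under the assumption that the method takes a `TextReader` and a `TextWriter`, so the caller picks the source, the destination, and the encoding. The names `LineFilters`, `CopyLinesRemovingConsecutiveDupes`, and `Demo` are illustrative.

```csharp
using System;
using System.IO;

static class LineFilters
{
    // Copies lines from reader to writer, skipping any line identical to
    // the one immediately before it. Neither argument is closed here.
    public static void CopyLinesRemovingConsecutiveDupes(TextReader reader, TextWriter writer)
    {
        string currentLine;
        string lastLine = null;

        while ((currentLine = reader.ReadLine()) != null)
        {
            if (currentLine != lastLine)
            {
                writer.WriteLine(currentLine);
                lastLine = currentLine;
            }
        }
    }
}

static class Demo
{
    static void Main()
    {
        // In-memory input, to show that files are no longer required.
        using (var reader = new StringReader("a\nb\nb\nc\nb\nd"))
        using (var writer = new StringWriter())
        {
            LineFilters.CopyLinesRemovingConsecutiveDupes(reader, writer);
            // Prints a, b, c, b, d on separate lines: only the adjacent
            // repeat of "b" is dropped; the later "b" survives.
            Console.Write(writer.ToString());
        }
    }
}
```

To read a file in another encoding, the caller could pass a `StreamReader` constructed with that encoding instead of the `StringReader` used in the demo.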
(Note that this doesn't close anything; the caller should do that.)
Here's a version that will remove all duplicates, rather than just consecutive ones:
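A minimal sketch of that idea, assuming a `HashSet<string>` serves as the set of lines already seen and that the method again takes a `TextReader` and a `TextWriter`. The names `LineFilters`, `CopyLinesRemovingAllDupes`, and `seenLines` are illustrative.

```csharp
using System.Collections.Generic;
using System.IO;

static class LineFilters
{
    // Copies lines from reader to writer, keeping only the first occurrence
    // of each distinct line. Neither argument is closed here.
    public static void CopyLinesRemovingAllDupes(TextReader reader, TextWriter writer)
    {
        string currentLine;
        var seenLines = new HashSet<string>();

        while ((currentLine = reader.ReadLine()) != null)
        {
            // HashSet<T>.Add returns true only when the item was not
            // already in the set, so each distinct line is written once.
            if (seenLines.Add(currentLine))
            {
                writer.WriteLine(currentLine);
            }
        }
    }
}
```

Given the six lines a, b, b, c, b, d as input, this variant writes a, b, c, d. The trade-off is memory: the set holds every distinct line, so usage grows with the number of unique lines rather than staying constant.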