Title: How can I search for characters in hexadecimal data?
William1949
Rank: 3
Level: Newbie
Prestige: 8
Posts: 109
Expert points: 0
Registered: 2009-3-17
Thread closure rate: 70%
Resolved  Question points: 40  Replies: 5
How can I search for characters in hexadecimal data?
For example: find "abab"

Requirements:
1. Fast;
2. Case-insensitive;

2020-08-11 11:18
William1949
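The file is fairly large (over 2GB);
I use memory-mapped file access to read the file in segments, then match against the contents of each segment;
The crux: a byte-by-byte search loop is very slow. I have already tried many algorithms, and none of them performed well;

With InStrB, however, the search is fast, exactly what I want; but InStrB is case-sensitive;
In other words: to get the speed I have to give up case-insensitivity, and to get case-insensitivity I have to give up the speed. A contradiction!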

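I ran a comparison against WinHex, using a 2.5GB file.
1. Searching with InStrB (case-sensitive):
    My time: under 3 seconds;
    WinHex: about 5 seconds (using "Find Hex Values, Ctrl+Alt+X")
2. Searching with a byte-by-byte loop (which also uses InStr):
    My time: 2 minutes, far too slow (and this is the fastest of my various algorithms);
    WinHex: about 6 seconds (using "Find Text, Ctrl+F")

I have no idea how WinHex manages it.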
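
One direction that might soften the speed-versus-case trade-off is to let InStrB do the heavy scanning for the first byte only (once for each of its two cases) and to apply the case-insensitive comparison only where a candidate turns up. The following is a minimal sketch under stated assumptions, not a description of how WinHex works: only ASCII letters are folded, the arrays are 0-based and non-empty, and FoldByte and FindNoCase are hypothetical names. It has not been timed against the 2.5GB file.

```vb
Option Explicit

' Fold an ASCII lower-case byte (97-122) to upper case; other bytes pass through.
Private Function FoldByte(ByVal b As Byte) As Byte
    If b >= 97 And b <= 122 Then FoldByte = b - 32 Else FoldByte = b
End Function

' ASCII case-insensitive search for Pattern in Chunk (both assumed 0-based, non-empty).
' Returns the 0-based byte offset of the first match, or -1 if there is none.
Public Function FindNoCase(Chunk() As Byte, Pattern() As Byte) As Long
    Dim hay As String, upKey As String, loKey As String
    Dim pUp As Long, pLo As Long, p As Long, i As Long
    Dim n As Long, lastStart As Long, first As Byte

    FindNoCase = -1
    n = UBound(Pattern) + 1
    hay = Chunk                          ' raw byte copy, made once per chunk
    lastStart = LenB(hay) - n + 1        ' last 1-based position where a match can start
    If lastStart < 1 Then Exit Function

    ' InStrB does the scanning, but only for the first byte, in each of its cases.
    first = FoldByte(Pattern(0))
    upKey = ChrB$(first)
    pUp = InStrB(1, hay, upKey, vbBinaryCompare)
    If first >= 65 And first <= 90 Then
        loKey = ChrB$(first + 32)
        pLo = InStrB(1, hay, loKey, vbBinaryCompare)
    End If

    Do
        ' Pick the nearer of the two candidates (0 means "no more hits").
        If pLo = 0 Or (pUp <> 0 And pUp < pLo) Then p = pUp Else p = pLo
        If p = 0 Or p > lastStart Then Exit Do

        ' Verify the rest of the pattern with both sides folded.
        For i = 1 To n - 1
            If FoldByte(Chunk(p - 1 + i)) <> FoldByte(Pattern(i)) Then Exit For
        Next
        If i = n Then
            FindNoCase = p - 1
            Exit Do
        End If

        ' Advance whichever scan produced this candidate.
        If pUp = p Then pUp = InStrB(p + 1, hay, upKey, vbBinaryCompare)
        If pLo = p Then pLo = InStrB(p + 1, hay, loKey, vbBinaryCompare)
    Loop
End Function
```

How well this performs depends on the data: a first byte that is common in the file produces many candidates, each verified in a VB loop. When the file is read in segments, a match that straddles two segments still needs separate handling.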

[This post was edited by the author on 2020-8-15 10:02]

2020-08-15 09:47
William1949
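First, a word of thanks to Moderator Feng (风版) for taking the time to help me out;

Regarding files over 2GB exceeding the range of Long: I read the file contents through the API rather than with Open xxx For Binary As xx
Here is the program I wrote:
Code: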
Option Explicit
Private Type SYSTEM_INFO
    dwOemID                     As Long
    dwPageSize                  As Long
    lpMinimumApplicationAddress As Long
    lpMaximumApplicationAddress As Long
    dwActiveProcessorMask       As Long
    dwNumberOfProcessors        As Long
    dwProcessorType             As Long
    dwAllocationGranularity     As Long
    wProcessorLevel             As Integer
    wProcessorRevision          As Integer
End Type
Private Declare Sub GetSystemInfo Lib "kernel32" (lpSystemInfo As SYSTEM_INFO)
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" (Destination As Any, Source As Any, ByVal Length As Long)
Private Declare Sub ZeroMemory Lib "kernel32" Alias "RtlZeroMemory" (lpDst As Any, ByVal Length As Long)

Private Declare Function CloseHandle Lib "kernel32" (ByVal hObject As Long) As Long
Private Declare Function CreateFile Lib "kernel32" Alias "CreateFileA" (ByVal lpFileName As String, ByVal dwDesiredAccess As Long, ByVal dwShareMode As Long, ByVal lpSecurityAttributes As Long, ByVal dwCreationDisposition As Long, ByVal dwFlagsAndAttributes As Long, ByVal hTemplateFile As Long) As Long
Private Const GENERIC_READ = &H80000000
Private Const FILE_SHARE_READ = &H1
Private Const FILE_SHARE_WRITE = &H2
Private Const FILE_SHARE_DELETE = &H4

Private Const OPEN_EXISTING = 3

Private Const FILE_ATTRIBUTE_NORMAL = &H80
Private Const FILE_FLAG_SEQUENTIAL_SCAN = &H8000000

Private Declare Function GetFileSize Lib "kernel32" (ByVal hFile As Long, lpFileSizeHigh As Long) As Long

'---  File mapping
Private Declare Function CreateFileMapping Lib "kernel32" Alias "CreateFileMappingA" (ByVal hFile As Long, ByVal lpFileMappigAttributes As Long, ByVal flProtect As Long, ByVal dwMaximumSizeHigh As Long, ByVal dwMaximumSizeLow As Long, ByVal lpName As String) As Long
Private Declare Function MapViewOfFile Lib "kernel32" (ByVal hFileMappingObject As Long, ByVal dwDesiredAccess As Long, ByVal dwFileOffsetHigh As Long, ByVal dwFileOffsetLow As Long, ByVal dwNumberOfBytesToMap As Long) As Long
Private Declare Function UnmapViewOfFile Lib "kernel32" (ByVal lpBaseAddress As Long) As Long
Private Const PAGE_READONLY = &H2
Private Const FILE_MAP_READ = &H4

Private Type LARGE_INTEGER
    LowPart  As Long
    HighPart As Long
End Type
Private fhWnd       As Long, AllocationGranularity As Long
Private mFileSize   As Currency
Private Buffer()    As Byte

Private Sub Form_Load()
    Dim SysInfo As SYSTEM_INFO
    Call GetSystemInfo(SysInfo)
    AllocationGranularity = SysInfo.dwAllocationGranularity
End Sub

Private Sub Form_Unload(Cancel As Integer)
    Erase Buffer
End Sub

Private Sub Command1_Click()
    Dim FileName    As String, FindStr  As String
    Dim Le          As Long, Pos        As Long, P As Long
    Dim FindPos     As Currency, Start  As Currency
    Dim bFind()     As Byte
    Dim TTT         As Single
    TTT = Timer                          ' start time, for the elapsed-time report
    
    FileName = "C:\xxx\xxx\xxx.yyy"      ' file name (including path)
    If Dir(FileName) = "" Then Exit Sub
'--------------------------------------------------------------
    FindStr = "FFAABBCC"                  ' bytes to find, written as a hex string
    Le = Len(FindStr)
    If Le Mod 2 = 1 Then Le = Le + 1
    Le = Le \ 2 - 1
    ReDim bFind(Le) As Byte
    Pos = 1
    For P = 0 To Le
        bFind(P) = Val("&H" & Mid(FindStr, Pos, 2))
        Pos = Pos + 2
    Next
'--------------------------------------------------------------
    
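    fhWnd = OpenFile(FileName)                  ' open the file
    If fhWnd = -1 Then Exit Sub                 ' INVALID_HANDLE_VALUE: the file could not be opened
    mFileSize = GetFileSizeAPI(fhWnd)           ' get the file size
    Debug.Print "File size = " & FormatNumber(mFileSize, 0, , , vbTrue) & " bytes"

    ReDim Buffer(AllocationGranularity - 1) As Byte     ' note: the buffer size must be AllocationGranularity or a whole multiple of it

' FindByte returns the 0-based byte offset of the match; -1 means no match.
' The Start argument is the position to start searching from; 0 means the beginning of the file.
    Start = 0
    FindPos = FindByte(bFind, Start)    ' search
    
    Call CloseHandle(fhWnd) ' close the file
    Erase bFind
    Debug.Print "Elapsed = " & (Timer - TTT) * 1000 & " ms;  " & "Match position = " & FindPos
End Sub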
End Sub

Private Function FindByte(ByteFind() As Byte, ByVal Start As Currency) As Currency
    Dim fMaphWnd    As Long, MapByteSum As Long, FindLen As Long, bStrPtr As Long, Start2 As Long
    Dim fSize       As Currency, Offset As Currency
    Dim Follow      As Boolean
    Dim bStrand()   As Byte

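    FindLen = UBound(ByteFind)              ' pattern length minus 1: the most bytes of a match that can sit on one side of a chunk boundary
    ReDim bStrand(FindLen * 2 - 1) As Byte  ' holds the tail of one chunk followed by the head of the next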
    bStrPtr = VarPtr(bStrand(0))
    
    MapByteSum = AllocationGranularity
    Offset = Int(Start / AllocationGranularity) * AllocationGranularity
    Start = Start - Offset + 1
    If MapByteSum - Start < FindLen Then Start2 = FindLen - (MapByteSum - Start) Else Start2 = 1
    fSize = mFileSize - Offset
    fMaphWnd = OpenFileMapping(fhWnd)
    Do
        If MapByteSum > fSize Then
            MapByteSum = fSize
            Call ZeroMemory(Buffer(0), AllocationGranularity)
        End If
        Call ReadFileMapping(fMaphWnd, Offset, MapByteSum, Buffer)
        If Follow = True Then
            Call CopyMemory(bStrand(FindLen), Buffer(0), FindLen)
            FindByte = InStrB(Start2, bStrand, ByteFind) - 1
            If FindByte > -1 Then
                FindByte = Offset - FindLen + FindByte
                Exit Do
            End If
            Start2 = 1
        End If
        FindByte = InStrB(Start, Buffer, ByteFind) - 1
        If FindByte > -1 Then
            FindByte = Offset + FindByte
            Exit Do
        End If
        If fSize > MapByteSum Then
            Call CopyMemory(ByVal bStrPtr, Buffer(MapByteSum - FindLen), FindLen)
            Follow = True
        End If
        Offset = Offset + AllocationGranularity
        fSize = fSize - MapByteSum
        Start = 1
    Loop Until fSize = 0
    Call CloseHandle(fMaphWnd)   ' close the file mapping
    Erase bStrand
End Function

Private Function OpenFile(ByVal FileName As String) As Long         ' open the file
    Dim ShareMode As Long
    ShareMode = FILE_SHARE_READ Or FILE_SHARE_WRITE Or FILE_SHARE_DELETE
    OpenFile = CreateFile(FileName, GENERIC_READ, ShareMode, 0, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL Or FILE_FLAG_SEQUENTIAL_SCAN, 0)
End Function

Private Function GetFileSizeAPI(ByVal FilehWnd As Long) As Currency ' file size in bytes
    Dim fLo As Long, fHi As Long
    fLo = GetFileSize(FilehWnd, fHi)
    GetFileSizeAPI = HighLowToSize(fLo, fHi)
End Function

Private Function OpenFileMapping(ByVal FilehWnd As Long, Optional ByVal FileSize As Currency = 0) As Long ' open the file mapping
    Dim fLo As Long, fHi As Long
    If FileSize > 0 Then Call SizeToHighLow(FileSize, fLo, fHi)
    OpenFileMapping = CreateFileMapping(FilehWnd, 0, PAGE_READONLY, fHi, fLo, vbNullString)
End Function

Private Function ReadFileMapping(ByVal MapFilehWnd As Long, ByVal Offset As Currency, ByVal ViewSize As Long, ByRef Buffer() As Byte) As Boolean
    Dim MapMemPtr As Long, fLo As Long, fHi As Long
    If Offset > 0 Then Call SizeToHighLow(Offset, fLo, fHi)
    MapMemPtr = MapViewOfFile(MapFilehWnd, FILE_MAP_READ, fHi, fLo, ViewSize)
    If MapMemPtr <> 0 Then      ' 0 (NULL) means failure; a valid address may have its top bit set and so look negative as a Long
        Call CopyMemory(Buffer(0), ByVal MapMemPtr, ViewSize)
        Call UnmapViewOfFile(MapMemPtr)
        ReadFileMapping = True
    End If
End Function

Private Function HighLowToSize(ByVal LowLong As Long, ByVal HighLong As Long) As Currency
    Dim LI As LARGE_INTEGER
    With LI
        .LowPart = LowLong
        .HighPart = HighLong
    End With
    Call CopyMemory(HighLowToSize, LI, Len(LI))
    HighLowToSize = HighLowToSize * 10000
End Function

Private Sub SizeToHighLow(ByVal FileSize As Currency, ByRef LowLong As Long, ByRef HighLong As Long)
    Dim LI As LARGE_INTEGER
    Call CopyMemory(LI, CCur(FileSize / 10000), Len(LI))
    With LI
        LowLong = .LowPart
        HighLong = .HighPart
    End With
End Sub
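
HighLowToSize above leans on the fact that a Currency is stored as a 64-bit integer scaled by 10000, which is why the CopyMemory result is multiplied by 10000. For comparison, here is a minimal sketch of the same conversion written as plain arithmetic. CombineHighLow and SplitHighLow are hypothetical names, not part of the program above. The one subtlety is that a VB6 Long is signed, so a low part with its top bit set arrives as a negative number and has to be shifted up by 2^32.

```vb
Option Explicit

Private Const TWO_31 As Currency = 2147483648@
Private Const TWO_32 As Currency = 4294967296@

' Combine the two Longs returned by GetFileSize into a byte count.
Public Function CombineHighLow(ByVal LowLong As Long, ByVal HighLong As Long) As Currency
    Dim lo As Currency
    lo = LowLong
    If LowLong < 0 Then lo = lo + TWO_32     ' reinterpret the signed Long as unsigned
    CombineHighLow = CCur(HighLong) * TWO_32 + lo
End Function

' Split a non-negative byte count into the high/low Longs expected by MapViewOfFile.
Public Sub SplitHighLow(ByVal Size As Currency, ByRef LowLong As Long, ByRef HighLong As Long)
    Dim lo As Currency
    HighLong = Int(Size / TWO_32)            ' whole 2^32-byte blocks
    lo = Size - CCur(HighLong) * TWO_32      ' remainder, 0 .. 2^32 - 1
    If lo >= TWO_31 Then lo = lo - TWO_32    ' wrap into the signed Long range
    LowLong = lo
End Sub
```

Either form is limited by the range of Currency (roughly 922 trillion), which is far beyond the file sizes discussed here.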


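A few notes on the code above:
1. Set the "FileName" variable to the full path of the file (files larger than 2GB are fine);
2. Set the "FindStr" variable to the search key. Note: the format is bytes written in hex, e.g. "FFAABBCC";
3. The code above only does byte-wise searching, which means it is case-sensitive. I have not managed to write a text-mode (case-insensitive) version (or rather, the one I wrote is extremely slow).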
2020-08-16 09:34
William1949
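Reply:
1. Files larger than 4.1GB will not overflow, and neither will files larger than 10GB. The lpFileSizeHigh parameter of GetFileSize receives the high-order Long of the file size, which works as a count of whole 2^32-byte blocks rather than as a size in its own right (at least that is how I understand it). For example, when the file is smaller than 2^32 (4294967296) bytes, lpFileSizeHigh is 0; from 4294967296 bytes up, it is 1; from 4294967296 * 2 bytes up, it is 2; and so on.

2. Can the InStrB return value exceed the range of Long? I have not tried, but the code I posted in post #6 cannot let that happen, because each search only covers the bytes in the buffer, and that buffer (Buffer) never comes close to the Long limit. Note: the buffer I defined is SysInfo.dwAllocationGranularity (65536) bytes.

3. I do not use Currency heavily; I only use it in the following cases:
    A. To hold the file size;
    B. To set the search start position; and FindByte reduces it with "Start = Start - Offset + 1", so the value of Start never gets large!
    C. As the offset (Offset), which is only passed on to MapViewOfFile and never reaches VB's built-in functions.

4. I ran your modified program and I think it has a problem: you only swapped the case of bFind2, while the buffer contents stay unchanged. That way some matches are missed:
For example: if bFind is "aBaB" and bFind2 is "AbAb", then "ABab" in the buffer (the Buffer byte array) is missed;
I have written code along those lines before, except that I converted every byte in the buffer in the range 97~122 to upper case, which is very time-consuming.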
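
To make the "ABab" case concrete: a pattern of k letters has up to 2^k case variants, so searching for two of them cannot cover mixed-case text, whereas folding both sides to one case can. Below is a minimal sketch of the fold-then-search idea using a 256-entry lookup table and a single InStrB call. FoldTable, InitFoldTable, FoldInPlace and FindFolded are hypothetical names, only ASCII a-z is folded, and whether the table is any faster than a range test in a VB6 loop has not been measured here.

```vb
Option Explicit

Private FoldTable(0 To 255) As Byte
Private TableReady As Boolean

' Build the table once: a-z (97-122) map to A-Z, everything else maps to itself.
Private Sub InitFoldTable()
    Dim i As Long
    For i = 0 To 255
        If i >= 97 And i <= 122 Then FoldTable(i) = i - 32 Else FoldTable(i) = i
    Next
    TableReady = True
End Sub

' Replace every byte of Buf with its folded value.
Public Sub FoldInPlace(Buf() As Byte)
    Dim i As Long
    If Not TableReady Then InitFoldTable
    For i = LBound(Buf) To UBound(Buf)
        Buf(i) = FoldTable(Buf(i))
    Next
End Sub

' Returns the 0-based offset of the first ASCII case-insensitive match, or -1.
' Both arrays are folded in place, so pass copies if the originals still matter.
Public Function FindFolded(Buf() As Byte, Pattern() As Byte) As Long
    FoldInPlace Buf
    FoldInPlace Pattern
    FindFolded = InStrB(1, Buf, Pattern, vbBinaryCompare) - 1
End Function
```

The per-byte loop is the expensive part. Compiling to native code with the "Remove Array Bounds Checks" advanced optimization may help, but that is a guess to be verified by timing, not a measured result.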
2020-08-16 16:00
William1949
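Ah, all right, let me explain it this way:
My FindStr = "FFAABBCC" is only an example, meant to draw attention to the format. The point is that the input must be bytes: only values from 00 to FF, with every two digits (two characters) forming one byte, so "FFAABBCC" means 4 bytes. You can enter whatever you like, as long as the format is right.

Another example (see the picture in post #1):
You could enter FindStr = "61426142"; it is up to you. I think I have made myself clear!
It works just like typing into the search box in WinHex;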
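
The format rule described here (two hex digits per byte, values 00 to FF) can be checked while the string is converted. A minimal sketch, assuming optional spaces between bytes should be accepted and that malformed input should raise an error instead of being padded; HexToBytes is a hypothetical helper, not the conversion loop from the program posted earlier:

```vb
Option Explicit

' Convert a hex string such as "61426142" or "61 42 61 42" into a Byte array.
' Raises error 5 if the text is empty, has an odd number of digits,
' or contains anything other than hex digits and spaces.
Public Function HexToBytes(ByVal HexStr As String) As Byte()
    Dim b() As Byte, pair As String
    Dim n As Long, i As Long

    HexStr = Replace$(HexStr, " ", "")
    If Len(HexStr) = 0 Or Len(HexStr) Mod 2 <> 0 Then
        Err.Raise 5, "HexToBytes", "Expected an even number of hex digits"
    End If

    n = Len(HexStr) \ 2
    ReDim b(0 To n - 1)
    For i = 0 To n - 1
        pair = Mid$(HexStr, i * 2 + 1, 2)
        If Not pair Like "[0-9A-Fa-f][0-9A-Fa-f]" Then
            Err.Raise 5, "HexToBytes", "Not a hex byte: " & pair
        End If
        b(i) = CByte("&H" & pair)
    Next
    HexToBytes = b
End Function
```

With this helper, "61426142" becomes the four bytes 97, 66, 97, 66, which are the ASCII codes for "aBaB".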
2020-08-16 16:50
William1949
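It won't miss anything????

Fine. If you say it won't miss anything, then it won't.

You have not actually tested it, so I have nothing more to say!

I ran real tests and found missed matches, yet you say "nothing will be missed". I posted to ask for help, and I do not want to argue back and forth over this.

Closing the thread.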
2020-08-17 09:05



To join the discussion, please go to the original thread: https://bbs.bccn.net/thread-502811-1-1.html



