Title: How can SAPI be used in VFP for offline speech-to-text?
jlliushi
Rank: 2
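Level: Forum Wanderer
Posts: 33
Expert points: 10
Registered: 2021-12-24
Thread resolution rate: 100%
Resolved  Question points: 30  Replies: 6
How can SAPI be used in VFP for offline speech-to-text?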
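SAPI has two basic engine types: the text-to-speech (TTS) engine and the speech recognition (SR) engine. TTS synthesizes the characters of a text or document into speech and "speaks" it aloud, while SR converts spoken audio into readable text or documents.
TTS has already been discussed on this forum; could one of the experts please post VFP code that implements speech recognition (SR)?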
2021-12-24 13:47
jlliushi
I found an article online that does this in VB, for the experts' reference...

' Speech input in VB using SAPI
' Requires a project reference to the Microsoft Speech Object Library (SpeechLib).
Option Explicit

' WithEvents variables must be declared at module level, not inside a procedure.
Private WithEvents RC As SpSharedRecoContext
Private myGrammar As ISpeechRecoGrammar

Private Sub Command1_Click()
    ' Create the shared recognition context and switch on free dictation.
    Set RC = New SpSharedRecoContext
    Set myGrammar = RC.CreateGrammar
    myGrammar.DictationSetState SGDSActive
End Sub

' Raised by SAPI each time a phrase has been recognized.
Private Sub RC_Recognition(ByVal StreamNumber As Long, ByVal StreamPosition As Variant, ByVal RecognitionType As SpeechLib.SpeechRecognitionType, ByVal Result As SpeechLib.ISpeechRecoResult)
    ' Append the recognized text to the text box.
    Text1.Text = Text1.Text & Result.PhraseInfo.GetText
End Sub

Private Sub Form_Unload(Cancel As Integer)
    Set myGrammar = Nothing
    Set RC = Nothing
End Sub

[This post was edited by the author on 2021-12-25 06:44]

2021-12-25 06:35
schtg
Rank: 11
From: https://t.me/pump_upp
Level: VIP
Prestige: 67
Posts: 1355
Expert points: 2534
Registered: 2012-2-29
Score: 0
This is great!
2021-12-25 06:39
jlliushi
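Here is C++ code, for the experts' reference...
Speech recognition programming involves several engine interfaces, including ISpRecognizer, ISpRecoContext, and ISpRecoGrammar. Below we first design a class, CSpeechRecognition, that drives speech recognition, and then build an example on top of it.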

11.3.1  Building the CSpeechRecognition class
CSpeechRecognition wraps the interfaces that speech recognition needs, which makes recognition code convenient and concise to write.
The class is defined as follows:
///////////////////////////////////////////////////////////////
// active speech engine
#include <atlbase.h>
extern CComModule _Module;
#include <atlcom.h>
#include <sapi.h>
#include <sphelper.h>
#include <spuihelp.h>
///////////////////////////////////////////////////////////////
// speech message
#define WM_SREVENT    WM_USER+102
 
class CSpeechRecognition  
{
public:
    CSpeechRecognition();
    virtual ~CSpeechRecognition();
 
    // initialize
    BOOL Initialize(HWND hWnd = NULL, BOOL bIsShared = TRUE);
    void Destroy();
 
    // start and stop
    BOOL Start();
    BOOL Stop();
    BOOL IsDictationOn()
    {
        return m_bOnDictation;
    }
 
    // event handler
    void GetText(WCHAR **ppszCoMemText, ULONG ulStart = 0, ULONG nlCount = -1);
 
    // voice training
    HRESULT VoiceTraining(HWND hWndParent);
 
    // microphone setup
    HRESULT MicrophoneSetup(HWND hWndParent);
 
    // token list
    HRESULT InitTokenList(HWND hWnd, BOOL bIsComboBox = FALSE);
 
    // error string
    CString GetErrorString()
    {
        return m_sError;
    }
 
    // interface
    CComPtr<ISpRecognizer> m_cpRecoEngine;         // SR engine
    CComPtr<ISpRecoContext> m_cpRecoCtxt;          // Recognition context for dictation
    CComPtr<ISpRecoGrammar> m_cpDictationGrammar;  // Dictation grammar
 
private:
    CString m_sError;
    BOOL    m_bOnDictation;
};
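The message WM_SREVENT defined here signals a speech recognition event; it is sent to the window passed to the initialization function.
The class defines three interface pointers, m_cpRecoEngine, m_cpRecoCtxt, and m_cpDictationGrammar, which reference the engine's three key interfaces: ISpRecognizer, ISpRecoContext, and ISpRecoGrammar.
The initialization function Initialize sets up the basic working environment of the recognition engine: the engine itself, the recognition context, the grammar, the audio input, and event notification: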
BOOL CSpeechRecognition::Initialize(HWND hWnd, BOOL bIsShared)
{
    // Initialize the COM library for this thread
    if (FAILED(CoInitialize(NULL)))
    {
        m_sError = _T("Error initializing COM");
        return FALSE;
    }

    // SR engine
    HRESULT hr = S_OK;
    if (bIsShared)
    {
        // Shared reco engine.
        // For a shared reco engine, the audio gets setup automatically
        hr = m_cpRecoEngine.CoCreateInstance(CLSID_SpSharedRecognizer);
    }
    else
    {
        // In-process engine: the audio input has to be set explicitly
        hr = m_cpRecoEngine.CoCreateInstance(CLSID_SpInprocRecognizer);

        // Create the default audio object and make it the engine's input
        CComPtr<ISpAudio> cpAudio;
        if (SUCCEEDED(hr))
            hr = SpCreateDefaultObjectFromCategoryId(SPCAT_AUDIOIN, &cpAudio);
        if (SUCCEEDED(hr))
            hr = m_cpRecoEngine->SetInput(cpAudio, TRUE);
    }

    // RecoContext
    if (SUCCEEDED(hr))
        hr = m_cpRecoEngine->CreateRecoContext(&m_cpRecoCtxt);

    // Set recognition notification for dictation
    if (SUCCEEDED(hr))
        hr = m_cpRecoCtxt->SetNotifyWindowMessage(hWnd, WM_SREVENT, 0, 0);

    if (SUCCEEDED(hr))
    {
        // Notify only when the engine has recognized something
        const ULONGLONG ullInterest = SPFEI(SPEI_RECOGNITION);
        hr = m_cpRecoCtxt->SetInterest(ullInterest, ullInterest);
    }

    // Activate the engine
    if (SUCCEEDED(hr))
        hr = m_cpRecoEngine->SetRecoState(SPRST_ACTIVE);

    // grammar
    if (SUCCEEDED(hr))
    {
        // Specifies that the grammar we want is a dictation grammar.
        // Initializes the grammar (m_cpDictationGrammar)
        hr = m_cpRecoCtxt->CreateGrammar(0, &m_cpDictationGrammar);
    }
    if (SUCCEEDED(hr))
        hr = m_cpDictationGrammar->LoadDictation(NULL, SPLO_STATIC);
    if (SUCCEEDED(hr))
        hr = m_cpDictationGrammar->SetDictationState(SPRS_ACTIVE);

    if (FAILED(hr))
    {
        m_cpDictationGrammar.Release();
        m_sError = _T("Error initializing the speech recognition engine");
    }

    // Dictation is running only if every step above succeeded,
    // so that a later Stop() actually deactivates the engine
    m_bOnDictation = SUCCEEDED(hr);
    return (hr == S_OK);
}
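Initialize carries a single HRESULT through a sequence of calls, and each step is meant to run only if the previous one succeeded. The portable C++ sketch below shows that short-circuit pattern in isolation. It is not SAPI code: Hr, kOk, kFail, succeeded, and runChain are stand-ins invented for this example, and real code would use HRESULT, S_OK, E_FAIL, and SUCCEEDED() from the Windows headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Stand-ins for the Windows HRESULT conventions (assumptions for this sketch).
using Hr = std::int32_t;
const Hr kOk = 0;               // plays the role of S_OK
const Hr kFail = -2147467259;   // E_FAIL (0x80004005) as a signed 32-bit value

// Same rule as the SUCCEEDED() macro: any non-negative code is a success.
inline bool succeeded(Hr hr) { return hr >= 0; }

// Runs the steps in order and stops at the first failure.
// Returns the first failing code, or the last step's code if all succeeded
// (kOk for an empty list). stepsRun, if given, receives how many steps ran.
Hr runChain(const std::vector<std::function<Hr()>>& steps,
            std::size_t* stepsRun = nullptr)
{
    Hr hr = kOk;
    std::size_t n = 0;
    for (const auto& step : steps)
    {
        hr = step();
        ++n;
        if (!succeeded(hr))
            break;              // later steps must not run after a failure
    }
    if (stepsRun != nullptr)
        *stepsRun = n;
    return hr;
}
```

The point of the pattern is that a failure code is never overwritten by a later call, so the value returned at the end always reflects the first step that went wrong.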
 
The release function Destroy is called by the class destructor and releases every interface the class holds:
void CSpeechRecognition::Destroy()
{
    if (m_cpDictationGrammar)
        m_cpDictationGrammar.Release();
    if (m_cpRecoCtxt)
        m_cpRecoCtxt.Release();
    if (m_cpRecoEngine)
        m_cpRecoEngine.Release();
    CoUninitialize();
}
Start and Stop control when speech is accepted and recognized; both call the engine interface's SetRecoState method:
BOOL CSpeechRecognition::Start()
{
    if (m_bOnDictation)
        return TRUE;

    HRESULT hr = m_cpRecoEngine->SetRecoState( SPRST_ACTIVE );
    if (FAILED(hr))
        return FALSE;

    m_bOnDictation = TRUE;
    return TRUE;
}

BOOL CSpeechRecognition::Stop()
{
    if (! m_bOnDictation)
        return TRUE;

    HRESULT hr = m_cpRecoEngine->SetRecoState( SPRST_INACTIVE );
    if (FAILED(hr))
        return FALSE;

    m_bOnDictation = FALSE;
    return TRUE;
}
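GetText is the key to retrieving the text recognized from speech. It should be called from the handler that responds to the recognition event/message. Its code is shown below.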
void CSpeechRecognition::GetText(WCHAR **ppszCoMemText, ULONG ulStart, ULONG nlCount)
{
    // The returned string is allocated by SAPI with CoTaskMemAlloc;
    // the caller must release it with CoTaskMemFree.
    *ppszCoMemText = NULL;
    CSpEvent event;

    // Process all of the recognition events
    while (event.GetFrom(m_cpRecoCtxt) == S_OK)
    {
        switch (event.eEventId)
        {
            case SPEI_RECOGNITION:
                // Several results may be queued; only the latest text is kept,
                // so free the previous one before it is overwritten
                if (*ppszCoMemText != NULL)
                {
                    ::CoTaskMemFree(*ppszCoMemText);
                    *ppszCoMemText = NULL;
                }
                if (nlCount == (ULONG)-1)
                    event.RecoResult()->GetText(SP_GETWHOLEPHRASE,
                        SP_GETWHOLEPHRASE, TRUE, ppszCoMemText, NULL);
                else
                {
                    ASSERT(nlCount > 0);
                    event.RecoResult()->GetText(ulStart, nlCount, FALSE,
                        ppszCoMemText, NULL);
                }
                break;
        }
    }
}
InitTokenList calls SpInitTokenComboBox or SpInitTokenListBox to show the available recognizer languages in a combo box or list box for selection:
HRESULT CSpeechRecognition::InitTokenList(HWND hWnd, BOOL bIsComboBox)
{
    if (bIsComboBox)
        return SpInitTokenComboBox(hWnd, SPCAT_RECOGNIZERS);
    else
        return SpInitTokenListBox(hWnd, SPCAT_RECOGNIZERS);
}
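Speech recognition depends on audio input, usually from a microphone. Before recognizing, check that the microphone's position and settings are suitable so that the engine receives usable audio. MicrophoneSetup calls the engine interface's DisplayUI method to show a microphone setup wizard, as shown in Figure 11-4. Sample code: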
HRESULT CSpeechRecognition::MicrophoneSetup(HWND hWndParent)
{
    return m_cpRecoEngine->DisplayUI(hWndParent, NULL, SPDUI_MicTraining, NULL, 0);
}
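Voice training is an important foundation of speech recognition: to get the expected accuracy, the engine must be trained so that it becomes familiar with the speaker's accent. VoiceTraining calls the engine interface's DisplayUI method to show a voice training wizard, as shown in Figure 11-5. Sample code: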
HRESULT CSpeechRecognition::VoiceTraining(HWND hWndParent)
{
    return m_cpRecoEngine->DisplayUI(hWndParent, NULL, SPDUI_UserTraining, NULL, 0);
}
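Like CText2Speech, CSpeechRecognition also provides an error-handling mechanism: GetErrorString returns the error message.
11.3.2  Example: writing a dictation program with CSpeechRecognition
Writing a speech recognition program with CSpeechRecognition is straightforward. Below we implement a dictation program, Stenotypist, whose interface is shown in Figure 11-6.
The steps and key points for building Stenotypist in Visual C++ are:
1) Use AppWizard to generate a dialog-based project named Stenotypist;
2) Add SpeechRecognition.H and SpeechRecognition.CPP to the Stenotypist project;
3) Lay out the corresponding controls in the resource editor;
4) Use ClassWizard to generate the matching members for the controls in the CStenotypistDlg class;
5) Edit StenotypistDlg.h to add the required variables and functions to CStenotypistDlg;
6) Use ClassWizard to add handlers for the controls and messages to CStenotypistDlg. The code of StenotypistDlg.h follows.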
#include "SpeechRecognition.h"
 
////////////////////////////////////////////////////////////////////
// CStenotypistDlg dialog
 
class CStenotypistDlg : public CDialog
{
// Construction
public:
    CStenotypistDlg(CWnd* pParent = NULL);    // standard constructor
 
// Dialog Data
    //{{AFX_DATA(CStenotypistDlg)
    enum { IDD = IDD_STENOTYPIST_DIALOG };
    CButton    m_btDictation;
    CString    m_strText;
    //}}AFX_DATA
 
    // ClassWizard generated virtual function overrides
    //{{AFX_VIRTUAL(CStenotypistDlg)
    protected:
    virtual void DoDataExchange(CDataExchange* pDX);    // DDX/DDV support
    //}}AFX_VIRTUAL
 
    CSpeechRecognition    m_SpeechRecognition;
 
// Implementation
protected:
    HICON m_hIcon;
 
    // Generated message map functions
    //{{AFX_MSG(CStenotypistDlg)
    virtual BOOL OnInitDialog();
    afx_msg void OnSysCommand(UINT nID, LPARAM lParam);
    afx_msg void OnPaint();
    afx_msg HCURSOR OnQueryDragIcon();
    afx_msg void OnButtonVt();
    afx_msg void OnButtonMs();
    afx_msg void OnButtonDictate();
    //}}AFX_MSG
    afx_msg LRESULT OnSREvent(WPARAM, LPARAM);
    DECLARE_MESSAGE_MAP()
};
Note that CStenotypistDlg defines a member object of class CSpeechRecognition.
OnInitDialog calls CSpeechRecognition's initialization function and populates the recognizer language list:
BOOL CStenotypistDlg::OnInitDialog()
{
    CDialog::OnInitDialog();
 
    // Add "About..." menu item to system menu.
 
    // IDM_ABOUTBOX must be in the system command range.
    ASSERT((IDM_ABOUTBOX & 0xFFF0) == IDM_ABOUTBOX);
    ASSERT(IDM_ABOUTBOX < 0xF000);
 
    CMenu* pSysMenu = GetSystemMenu(FALSE);
    if (pSysMenu != NULL)
    {
        CString strAboutMenu;
        strAboutMenu.LoadString(IDS_ABOUTBOX);
        if (!strAboutMenu.IsEmpty())
        {
            pSysMenu->AppendMenu(MF_SEPARATOR);
            pSysMenu->AppendMenu(MF_STRING, IDM_ABOUTBOX, strAboutMenu);
        }
    }
 
    // Set the icon for this dialog.  The framework does this automatically
    //  when the application's main window is not a dialog
    SetIcon(m_hIcon, TRUE);            // Set big icon
    SetIcon(m_hIcon, FALSE);        // Set small icon
   
    // TODO: Add extra initialization here
    if (! m_SpeechRecognition.Initialize(m_hWnd))
        AfxMessageBox(m_SpeechRecognition.GetErrorString());
    m_SpeechRecognition.InitTokenList(GetDlgItem(IDC_LIST1)->m_hWnd);
 
    m_SpeechRecognition.Stop();
   
    return TRUE;  // return TRUE  unless you set the focus to a control
}
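Starting and stopping dictation is simple to implement: just call the corresponding CSpeechRecognition functions, as shown below. Note that the button toggles between stop and start.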
void CStenotypistDlg::OnButtonDictate()
{
    if (m_SpeechRecognition.IsDictationOn())
    {
        // Currently dictating: stop and offer "Dictate" again
        m_SpeechRecognition.Stop();
        m_btDictation.SetWindowText("&Dictate");

        SetWindowText("Stenotypist - press <Dictate> to start dictation!");
    }
    else
    {
        // Currently idle: start and offer "Stop"
        m_SpeechRecognition.Start();
        m_btDictation.SetWindowText("&Stop");

        SetWindowText("Stenotypist - recording, please speak...");
    }
}
Microphone setup and voice training are likewise implemented by calling CSpeechRecognition member functions directly:
void CStenotypistDlg::OnButtonVt()
{
    m_SpeechRecognition.VoiceTraining(m_hWnd);
}

void CStenotypistDlg::OnButtonMs()
{
    m_SpeechRecognition.MicrophoneSetup(m_hWnd);
}
To respond to the WM_SREVENT message, add a matching message handler:
BEGIN_MESSAGE_MAP(CStenotypistDlg, CDialog)
    //{{AFX_MSG_MAP(CStenotypistDlg)
    ON_WM_SYSCOMMAND()
    ON_WM_PAINT()
    ON_WM_QUERYDRAGICON()
    ON_BN_CLICKED(IDC_BUTTON_VT, OnButtonVt)
    ON_BN_CLICKED(IDC_BUTTON_MS, OnButtonMs)
    ON_BN_CLICKED(IDC_BUTTON_DICTATE, OnButtonDictate)
    //}}AFX_MSG_MAP
    ON_MESSAGE(WM_SREVENT, OnSREvent)
END_MESSAGE_MAP()
 
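LRESULT CStenotypistDlg::OnSREvent(WPARAM, LPARAM)
{
    WCHAR *pwzText = NULL;
    m_SpeechRecognition.GetText(&pwzText);

    if (pwzText != NULL)
    {
        m_strText += CString(pwzText);
        ::CoTaskMemFree(pwzText);    // the text was allocated by SAPI with CoTaskMemAlloc
        UpdateData(FALSE);
    }

    return 0L;
}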
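7) To call the Speech engine, configure the include and lib settings in the Microsoft Visual C++ environment:
① Set the include path
●    Open the Project Settings dialog via Project→Settings;
●    Click the C/C++ tab;
●    Select Preprocessor in the Category drop-down list;
●    In the "Additional include directories" box, enter the include path of the installed Speech SDK; the default is C:\Program Files\Microsoft Speech SDK 5.1\Include.
② Set the lib information
●    Open the Project Settings dialog via Project→Settings;
●    Select the Link tab;
●    Select Input in the Category drop-down list;
●    In the "Additional library path" box, enter the lib path of the installed Speech SDK; the default is C:\Program Files\Microsoft Speech SDK 5.1\Lib\i386;
●    Enter "sapi.lib" in the "Object/library modules" box.
8) Compile and link the project, and Stenotypist is ready to take dictation.
All source code of the Stenotypist project is in the \Source\Stenotypist directory of the accompanying disc.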
2021-12-25 07:09
jlliushi
Here is code and analysis implemented in VB2010, for reference. Could one of the experts rewrite it as VFP code...
http://wenku.baidu.com/view/80f9040d763231126edb1139.html

Option Explicit On
Imports SpeechLib

Public Class Form1
    Public WithEvents RC As SpSharedRecoContext
    Public myGrammar As ISpeechRecoGrammar
    ' i = command mode (RadioButton1), j = display-only mode (RadioButton2)
    Dim i, j As Boolean

    Private Sub Form1_Load(sender As System.Object, e As System.EventArgs) Handles MyBase.Load
        RC = New SpSharedRecoContext
        myGrammar = RC.CreateGrammar()
        ' Load the command grammar from sol.xml and activate its top-level rules
        myGrammar.CmdLoadFromFile("sol.xml", SpeechLoadOption.SLOStatic)
        myGrammar.CmdSetRuleIdState(0, SpeechRuleState.SGDSActive)

        RC.Voice.Speak("now system started")
    End Sub

    Private Sub RC_FalseRecognition(ByVal StreamNumber As Integer, ByVal StreamPosition As Object, ByVal Result As SpeechLib.ISpeechRecoResult) Handles RC.FalseRecognition
        TextBox1.Text = "(no recognition)"
    End Sub

    Private Sub RC_StartStream(ByVal StreamNumber As Integer, ByVal StreamPosition As Object) Handles RC.StartStream
        TextBox1.Text = StreamNumber.ToString()
    End Sub

    ' Single Recognition handler covering both modes
    Private Sub RC_Recognition(ByVal StreamNumber As Integer, ByVal StreamPosition As Object, ByVal RecognitionType As SpeechLib.SpeechRecognitionType, ByVal Result As SpeechLib.ISpeechRecoResult) Handles RC.Recognition
        If Not (i Or j) Then Return

        Dim phrase As String = Result.PhraseInfo.GetText()
        TextBox1.Text = phrase

        If i Then
            RC.Voice.Speak("now I am listening for your command")
            Select Case phrase
                Case "start"
                    MsgBox("The program is now starting")
                Case "stop"
                    MsgBox("This is the first speech program I have written, I am so happy!")
                Case "net"
                    Shell("C:/Program Files/Internet Explorer/IEXPLORE.EXE")
                Case "end"
                    End
            End Select
        End If
    End Sub

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        ' Read the text box contents aloud
        Dim strData As String
        strData = StrConv(TextBox1.Text, VbStrConv.SimplifiedChinese, 2052)
        RC.Voice.Speak(strData)
    End Sub

    Private Sub RadioButton1_CheckedChanged(sender As System.Object, e As System.EventArgs) Handles RadioButton1.CheckedChanged
        i = True
        j = False
    End Sub

    Private Sub RadioButton2_CheckedChanged(sender As System.Object, e As System.EventArgs) Handles RadioButton2.CheckedChanged
        j = True
        i = False
    End Sub
End Class

Grammar file (sol.xml, loaded by CmdLoadFromFile above):
<GRAMMAR LANGID="409">
    <DEFINE>
        <ID NAME="RID_NewGame" VAL="101"/>
    </DEFINE>
    <RULE NAME="newgame" ID="RID_NewGame" TOPLEVEL="ACTIVE">
        <L>
            <P>start</P>
            <P>net</P>
            <P>end</P>
            <P>stop</P>
        </L>
    </RULE>
</GRAMMAR>
2021-12-26 04:53
吹水佬
Rank: 20
Level: Moderator
Prestige: 432
Posts: 10064
Expert points: 41463
Registered: 2014-5-20
Score: 30
Microsoft Speech API (SAPI)
https://docs.(v=vs.85)

SAPI Application Object Classes
https://docs.(v=vs.85)
2021-12-26 07:28
jlliushi
Today I found a VBS script; I tested it and it works...
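'==========================================================================
' Name : CommandPC.VBS
' AUTHOR : HUAYING
' DATE : 2005-1-31
'==========================================================================
Const SLOStatic = 0 ' SAPI SpeechLoadOption value; VBScript cannot see the type-library constants

Dim CommandDictionary ' command dictionary object
Dim WshShell ' WshShell object: gives access to local Windows programs
Dim ScriptComplete ' script-finished flag
Dim SR ' speech recognition context object
Dim Grammar ' command grammar object used for recognition

' Initialize the command dictionary; add commands of your own as needed.
' The keys are the spoken phrases, kept in Chinese because that is what the recognizer returns.
Set CommandDictionary = CreateObject("Scripting.Dictionary")
CommandDictionary.Add "上网", """C:\Program Files\Internet Explorer\iexplore.exe""" ' "go online"; note the number of double quotes
CommandDictionary.Add "计算器", "calc" ' "calculator"
CommandDictionary.Add "记事本", "notepad" ' "notepad"
CommandDictionary.Add "空当接龙", "freecell" ' "FreeCell"
Set WshShell = CreateObject("WScript.Shell") ' create the WshShell object
ScriptComplete = False ' initialize the script-finished flag

' Create the recognition context, load the grammar defined in "Command.xml",
' and start the speech recognition engine
Set SR = WScript.CreateObject("SAPI.SpSharedRecoContext", "RecoContext_")
Set Grammar = SR.CreateGrammar
Grammar.CmdLoadFromFile "Command.xml", SLOStatic
Grammar.CmdSetRuleIdState 0, 1

MsgBox "Hello, master. What can I do for you?"
' Wait for spoken commands (a microphone is required).
' The script ends when the phrase "命令结束" ("end commands") is recognized.
Do
    WScript.Sleep 1000
Loop Until ScriptComplete
MsgBox "Talk to me again any time. Goodbye!"

' Called when a spoken command has been recognized
Sub RecoContext_Recognition(ByVal StreamNumber, ByVal StreamPosition, ByVal RecognitionType, ByVal Result)
    Dim Text
    Text = Result.PhraseInfo.GetText ' the phrase recognized by the engine
    If Text <> "命令结束" Then
        'WshShell.Run CommandDictionary.Item(Text) ' run the command through WshShell's Run method
        MsgBox Text
    Else
        ScriptComplete = True ' signal the main loop to finish
    End If
End Sub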
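
For anyone porting the script's command lookup to compiled code, here is a minimal C++ sketch of just the dispatch step: map a recognized phrase to a command line and treat one phrase as the quit signal. It contains no SAPI calls. The names CommandTable, Dispatch, dispatchPhrase, makeDefaultTable, and kQuitPhrase are made up for illustration, and the English phrases are assumed stand-ins for the Chinese ones in the script.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical counterpart of the script's CommandDictionary:
// recognized phrase -> command line to launch.
using CommandTable = std::map<std::string, std::string>;

// Phrase that ends the session (the script uses "命令结束").
const std::string kQuitPhrase = "end commands";

struct Dispatch
{
    bool quit = false;      // true when the quit phrase was heard
    bool found = false;     // true when the phrase maps to a command
    std::string command;    // command line to run; empty unless found
};

// Builds a small table mirroring three of the script's entries.
CommandTable makeDefaultTable()
{
    CommandTable table;
    table["calculator"] = "calc";
    table["notepad"] = "notepad";
    table["freecell"] = "freecell";
    // The quit phrase must not double as a command.
    assert(table.count(kQuitPhrase) == 0);
    return table;
}

// Decides what to do with one recognized phrase.
// Unknown phrases are ignored instead of being passed to the shell.
Dispatch dispatchPhrase(const CommandTable& table, const std::string& phrase)
{
    Dispatch d;
    if (phrase == kQuitPhrase)
    {
        d.quit = true;
        return d;
    }
    CommandTable::const_iterator it = table.find(phrase);
    if (it != table.end())
    {
        d.found = true;
        d.command = it->second;
    }
    return d;
}
```

In a real program the recognition handler would call dispatchPhrase with the recognized text and launch d.command only when d.found is true, which avoids running a lookup result for a phrase that is not in the table.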
2021-12-27 17:21



To join the discussion, please visit the original thread: https://bbs.bccn.net/thread-507979-1-1.html



