This module discussed how and when to use full-text searching.
Full-text searching was a very powerful new tool in SQL Server 7 that allowed you to search rapidly on very large text fields.
Having completed this module, you should be able to:
- Describe what full-text searching is and when to use
- Describe considerations for full-text searching
- Practice implementing full-text searching
In the next module, you will learn how cursors can help you manipulate data in SQL Server record sets.
Full-Text Search in SQL Server has evolved from 2012 to 2022, with improvements in functionality, performance, and integration. Below is a concise overview of the key changes and enhancements:
SQL Server 2012
- Semantic Search: Introduced statistical semantic search, enabling deeper document analysis by extracting key phrases and identifying similar documents using T-SQL rowset functions. Requires the "Full-Text and Semantic Extractions for Search" feature.
- Word Breakers and Stemmers: Updated for US and UK English (LCID 1033, 2057), with an option to revert to previous versions for compatibility.
- Customizable Stoplist and Thesaurus: Enhanced stoplists to filter out common words and customizable thesauri for synonym-based searches.
- Change Tracking: Supported AUTO change tracking for full-text indexes, automatically updating indexes as data changes.
- Performance: Relied on the Full-Text Engine for indexing and querying, with batch processing for crawls and inverted index creation.
- Limitations: Performance issues with complex queries or large datasets, as full-text searches often scanned all rows before applying other filters, leading to slower execution in some scenarios.
SQL Server 2014 to 2019
- Integration with FILESTREAM: Improved synergy with FILESTREAM for searching text in documents (e.g., PDFs, Word) stored in SQL Server.
- Language Support: Expanded language-specific word breakers and stemmers, with neutral word breakers for unsupported languages.
- Performance Tweaks: Incremental improvements in indexing and query performance, though core architecture remained similar.
- Express Edition: Full-Text Search available in SQL Server Express with Advanced Services, requiring specific installation steps.
SQL Server Full-Text Search Enhancements
- Performance Improvements: Optimized full-text indexing and querying, leveraging modern hardware for faster crawls and searches. Improved query plans to address some performance bottlenecks noted in earlier versions (e.g., reducing unnecessary full-table scans).
- Azure Synergy: Enhanced integration with Azure services, such as Azure Cognitive Search, for hybrid search scenarios, allowing full-text search to complement cloud-based AI-driven search.
- Security and Management: Improved security for full-text catalogs and indexes, with better permission management (e.g., VIEW SERVER STATE, VIEW DATABASE STATE).
- Semantic Search Refinements: Continued support for statistical semantic search with minor enhancements in key phrase extraction accuracy.
- Deprecation of Legacy Components: Some older word breakers or filters (e.g., Microsoft Office 2010 Filter Packs) are no longer supported, requiring updated configurations for semantic search.
- Monitoring and Diagnostics: Enhanced dynamic management views (e.g., sys.dm_fts_parser) for better troubleshooting and performance monitoring.
Comparison of SQL Server 2012 and 2022 Full-Text Search
- Performance: SQL Server 2022 offers better query optimization and hardware utilization, addressing some of the performance issues in 2012 where full-text searches could be slow for large datasets or complex queries.
- Semantic Search: Introduced in 2012, it remains largely unchanged but benefits from minor accuracy improvements and modern hardware in 2022.
- Installation and Compatibility: SQL Server 2022 requires careful configuration for semantic search due to deprecated legacy components, while 2012 relied on older filter packs.
- Cloud Integration: 2022 emphasizes hybrid scenarios with Azure, which was not a focus in 2012.
- Scalability: 2022 handles larger text datasets more efficiently, making it better suited for modern big data workloads.
Summary
While the core Full-Text Search architecture (Full-Text Engine, indexing, querying) remains consistent, SQL Server 2022 introduces performance optimizations, better Azure integration, and refined management features compared to 2012. Semantic search, a major addition in 2012, sees incremental improvements. Users upgrading to 2022 should note changes in legacy component support and plan for updated configurations. For detailed setup steps or specific use cases, refer to Microsoft Learn or SQL Server documentation.
The history of Full-Text Search began in late 1998 when Microsoft reengineered one of its search engines
(Site Server Search, designed for websites) to provide search services for SQL Server 7. The engine was
called MSSearch, and it also provided search services to Exchange Content Indexing and SharePoint Portal
Server 2001. I liked Full-Text Search when it was first introduced back in SQL Server 7, and I am glad that it is
still here and Microsoft is continuing to invest in it.
Microsoft continued to improve iFTS’s performance and scalability with SQL Server 2000 and SQL Server
2005. Also, in case you did not follow the evolution of Full-Text Search back in SQL Server 2005, Microsoft
worked on bringing Full-Text Search closer to industry standards:
- The list of noise words was renamed to the industry standard term of stoplist .
- The many set-up stored procedures were simplified into normal DDL CREATE, ALTER,
and DROP commands.
Before moving on to the next module, click the Quiz link below to check your knowledge of the material covered in this module with a short, multiple-choice quiz.
Full Text Searching - Quiz