header_utils
ghassanpl::string_ops Namespace Reference

Classes

struct  split_range
 A very basic "range" (not really a C++ range yet) that can be iterated over as if it's a range of elements in source split by split_on.
 
struct  text_decode_result
 Shamelessly stolen from https://github.com/arc80/plywood/.
 
struct  text_encoding
 Type that represents a specific text encoding - a combination of ghassanpl::string_ops::text_encoding_type and endianness.
 
struct  utf8_view
 A simple view over a UTF-8 string range with codepoint values.
 

Concepts

concept  string_or_char
 The type is a stringable or a character.
 
concept  stringable
 The type is "stringable", that is, a contiguous range of characters.
 
concept  string8
 The type is a string with an 8-bit char type.
 
concept  stringable8
 The type is convertible to a string view with an 8-bit char type.
 
concept  string_view8
 The type is a string view with an 8-bit char type.
 
concept  string16
 The type is a string with a 16-bit char type.
 
concept  stringable16
 The type is convertible to a string view with a 16-bit char type.
 
concept  string_view16
 The type is a string view with a 16-bit char type.
 
concept  string32
 The type is a string with a 32-bit char type.
 
concept  stringable32
 The type is convertible to a string view with a 32-bit char type.
 
concept  string_view32
 The type is a string view with a 32-bit char type.
 
concept  charable
 Can a type be bit-cast to a native/utf char type?
 
concept  char_type
 Whether the type is a native char type.
 
concept  utf_type
 Whether the type is a utf char type.
 
concept  stringable_base_type
 Whether the type is a native or utf char type.
 

Typedefs

using wide_char16_t = std::conditional_t< sizeof(wchar_t)==sizeof(char16_t), wchar_t, char16_t >
 The default 16-bit char type for the current platform (wchar_t if it is 16-bit, char16_t otherwise)
 
using wide_char32_t = std::conditional_t< sizeof(wchar_t)==sizeof(char32_t), wchar_t, char32_t >
 The default 32-bit char type for the current platform (wchar_t if it is 32-bit, char32_t otherwise)
 
template<charable T>
using charable_utf_t = std::conditional_t< same_size_and_alignment< T, char8_t >, char8_t, std::conditional_t< same_size_and_alignment< T, char16_t >, char16_t, std::conditional_t< same_size_and_alignment< T, char32_t >, char32_t, void > > >
 The utf char type corresponding to the charable type.
 
template<charable T>
using charable_char_t = std::conditional_t< same_size_and_alignment< T, char >, char, std::conditional_t< same_size_and_alignment< T, wide_char16_t >, wide_char16_t, std::conditional_t< same_size_and_alignment< T, wide_char32_t >, wide_char32_t, void > > >
 The native char type corresponding to the charable type.
 
template<charable T>
using best_stringable_type = std::conditional_t< stringable_base_type< T >, T, charable_char_t< T > >
 

Enumerations

enum class  text_encoding_type {
  unknown , utf8 , utf16 , utf32 ,
  utf7 , utf1 , utf_ebcdic , scsu ,
  bocu1 , gb18030
}
 Specifies a base text-encoding, ignoring endianness for multi-byte encodings.
 
enum class  unicode_plane {
  unicode_plane::invalid , basic_multilingual_plane , supplementary_multilingual_plane , supplementary_ideographic_plane ,
  tertiary_ideographic_plane , supplementary_special_purpose_plane , private_use_plane_a , private_use_plane_b ,
  bmp , smp , sip , tip ,
  ssp , spua_a , pup_a , spua_b ,
  pup_b
}
 Represents the Unicode plane.
 

Functions

template<typename COUT , typename CIN >
requires charable<COUT> && charable<CIN> && same_size_and_alignment<COUT, CIN>
constexpr std::basic_string_view< COUT > string_view_cast (std::basic_string_view< CIN > id) noexcept
 Casts a string_view to a string_view with a different char type via a simple reinterpret_cast.
 
constexpr std::string_view back (std::string_view child_to_back_up, std::string_view parent, size_t n=1) noexcept
 Creates a string_view with its beginning moved back by n characters, limited to a parent range.
 
constexpr std::string_view back (std::string_view child_to_back_up, size_t n=1) noexcept
 Creates a string_view with its beginning moved back by n characters.
 
constexpr bool is_inside (std::string_view big_string, std::string_view smaller_string)
 Checks if smaller_string is a true subset of big_string (true subset meaning they view over overlapping memory subregions)
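
 One way to implement such a containment check is to compare the address ranges of the two views. The sketch below is a guess at the idea, not the library's code: is_inside_sketch is a hypothetical name, and it assumes "inside" means that the smaller view lies entirely within the bigger one.

```cpp
#include <functional>
#include <string_view>

// Hypothetical helper, not part of ghassanpl::string_ops.
// Returns true when `smaller` views memory that lies entirely within `big`.
inline bool is_inside_sketch(std::string_view big, std::string_view smaller) noexcept
{
    // std::less_equal on pointers yields a total order even for unrelated
    // objects, which the built-in <= operator does not guarantee.
    const std::less_equal<const char*> le;
    return le(big.data(), smaller.data())
        && le(smaller.data() + smaller.size(), big.data() + big.size());
}
```

 A view obtained with big.substr(...) passes the check, while a view over a separate buffer with equal contents does not, because only addresses are compared.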
 
constexpr bool isascii (char32_t cp) noexcept
 Returns true if cp is an ascii codepoint.
 
constexpr bool is_ascii (char32_t cp) noexcept
 Returns true if cp is an ascii codepoint.
 
bool contains (std::string_view str, char c)
 A pre-C++23 version of str.contains(c)
 
std::string_view substr (std::string_view str, intptr_t start, size_t count=std::string::npos) noexcept
 Gets a substring of str starting at start and containing count characters.
 
std::string_view prefix (std::string_view str, size_t count) noexcept
 Returns a substring containing the count leftmost characters of str. Always valid, clamped to the bounds of str (or empty).
 
std::string_view without_suffix (std::string_view str, size_t count) noexcept
 Returns a substring created by removing count characters from the end. Always valid, clamped to the bounds of str (or empty).
 
std::string_view suffix (std::string_view str, size_t count) noexcept
 Returns a substring containing the count rightmost characters of str. Always valid, clamped to the bounds of str (or empty).
 
std::string_view without_prefix (std::string_view str, size_t count) noexcept
 Returns a substring created by removing count characters from the start. Always valid, clamped to the bounds of str (or empty).
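
 The "always valid, clamped" behaviour of prefix, suffix, without_prefix and without_suffix can be pictured with a few one-liners. These are hypothetical stand-ins (note the _sketch names) written from the descriptions above, not the library's implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>

// Hypothetical stand-ins; each clamps `count` to the size of `str`,
// so an oversized count yields the whole string or an empty view.
constexpr std::string_view prefix_sketch(std::string_view str, std::size_t count) noexcept
{
    return str.substr(0, std::min(count, str.size()));
}

constexpr std::string_view without_prefix_sketch(std::string_view str, std::size_t count) noexcept
{
    return str.substr(std::min(count, str.size()));
}

constexpr std::string_view suffix_sketch(std::string_view str, std::size_t count) noexcept
{
    return str.substr(str.size() - std::min(count, str.size()));
}

constexpr std::string_view without_suffix_sketch(std::string_view str, std::size_t count) noexcept
{
    return str.substr(0, str.size() - std::min(count, str.size()));
}
```

 Because the start position never exceeds str.size(), std::string_view::substr cannot throw here, which is what makes the noexcept specifiers safe in this sketch.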
 
void erase_outside_n (std::string &str, size_t start, size_t count) noexcept
 Erases all characters in str outside of the range [start, start + count]. Always safe.
 
void erase_outside_from_to (std::string &str, size_t from, size_t to) noexcept
 Erases all characters in str outside of the range [from, to].
 
template<std::ranges::random_access_range T>
requires stringable_base_type<std::ranges::range_value_t<T>>
constexpr bool isany (char32_t cp, T &&chars) noexcept
 Checks if cp is any of the characters in chars
 
constexpr bool isany (char32_t c, char32_t c2) noexcept
 An isany overload that takes a single character.
 
template<string_or_char NEEDLE, typename FUNC >
void find_all (std::string_view subject, NEEDLE &&search, FUNC &&func)
 
template<typename RESULT_TYPE = std::string_view, string_or_char NEEDLE>
std::vector< RESULT_TYPE > find_all (std::string_view subject, NEEDLE &&search)
 
std::string url_encode (std::string_view text)
 Returns a url-encoded version of the string.
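
 The listing does not say which characters url_encode leaves untouched. As an illustration only, the sketch below percent-encodes every byte outside the RFC 3986 "unreserved" set; url_encode_sketch is a hypothetical name, and the library may make different choices (for example, how it treats spaces).

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: percent-encodes all bytes except RFC 3986 unreserved characters.
inline std::string url_encode_sketch(std::string_view text)
{
    constexpr char hex[] = "0123456789ABCDEF";
    std::string out;
    for (const char ch : text) {
        const auto c = static_cast<unsigned char>(ch);
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9')
            || c == '-' || c == '_' || c == '.' || c == '~';
        if (unreserved) {
            out += ch;
        } else {
            // Emit the byte as %XX with uppercase hex digits.
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}
```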
 
std::string url_unencode (std::string_view text)
 Returns a url-decoded version of the string.
 
template<std::integral T>
auto from_chars (std::string_view str, T &value, const int base=10) noexcept
 A version of std::from_chars that takes a std::string_view as the first argument.
 
template<std::floating_point T>
auto from_chars (std::string_view str, T &value, const std::chars_format chars_format=std::chars_format::general) noexcept
 A version of std::from_chars that takes a std::string_view as the first argument.
 
template<typename RESULT_TYPE = std::string_view, typename T , typename FUNC >
requires std::is_arithmetic_v<T>&& std::is_invocable_r_v<T, FUNC, std::string_view>
std::vector< RESULT_TYPE > word_wrap (std::string_view _source, T max_width, FUNC width_getter)
 Performs a basic word-wrapping split of _source, as if it was constrained to max_width.
 
template<typename RESULT_TYPE = std::string_view, typename T >
requires std::is_arithmetic_v<T>
std::vector< RESULT_TYPE > word_wrap (std::string_view _source, T max_width, T letter_width)
 Word-wrapping function for constant-width characters.
 
size_t levenshtein_distance (std::string_view s1, std::string_view s2)
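
 This entry carries no description. Assuming it computes the classic Levenshtein edit distance (insertions, deletions and substitutions, each costing 1), a two-row dynamic-programming version might look like the sketch below. levenshtein_sketch is a hypothetical name; the library's algorithm may differ.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical helper: classic edit distance using two rolling rows of the DP table.
inline std::size_t levenshtein_sketch(std::string_view a, std::string_view b)
{
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    // Distance from the empty prefix of `a` to each prefix of `b` is its length.
    std::iota(prev.begin(), prev.end(), std::size_t{0});
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i; // deleting i characters of `a`
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t substitution = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min({ prev[j] + 1, cur[j - 1] + 1, substitution });
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}
```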
 
template<typename CALLBACK >
requires std::invocable<CALLBACK, size_t, std::string_view, std::string&>
std::string callback_format (std::string_view fmt, CALLBACK &&callback)
 
constexpr bool is_high_surrogate (char32_t cp) noexcept
 Returns whether cp is a codepoint that encodes the high part of a codepoint with a more-than-16-bit value.
 
constexpr bool is_low_surrogate (char32_t cp) noexcept
 Returns whether cp is a codepoint that encodes the low part of a codepoint with a more-than-16-bit value.
 
constexpr bool is_surrogate (char32_t cp) noexcept
 Returns whether cp is a codepoint that encodes any part of a codepoint with a more-than-16-bit value.
 
constexpr bool is_unicode (char32_t cp) noexcept
 Returns whether cp has a value that is a valid Unicode codepoint (ie. between 0 and 0x10FFFF).
 
constexpr bool is_unicode_character (char32_t cp) noexcept
 Returns whether cp has a value that is a valid Unicode character (ie.
 
constexpr char32_t surrogate_pair_to_codepoint (char32_t high, char32_t low) noexcept
 Returns the codepoint encoded by two surrogates.
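
 For reference, the surrogate ranges and the pair-to-codepoint arithmetic defined by UTF-16 can be written out as follows. The _sketch functions are hypothetical illustrations of that arithmetic, not the library's source, and they do not validate their inputs.

```cpp
// Hypothetical helpers illustrating the UTF-16 surrogate arithmetic.
constexpr bool is_high_surrogate_sketch(char32_t cp) noexcept
{
    return cp >= 0xD800 && cp <= 0xDBFF;
}

constexpr bool is_low_surrogate_sketch(char32_t cp) noexcept
{
    return cp >= 0xDC00 && cp <= 0xDFFF;
}

// Each surrogate contributes 10 bits; the result is offset by 0x10000.
// Assumes `high` and `low` are valid high and low surrogates.
constexpr char32_t surrogate_pair_to_codepoint_sketch(char32_t high, char32_t low) noexcept
{
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
}
```

 For example, the pair 0xD83D, 0xDE00 decodes to U+1F600.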
 
constexpr auto get_unicode_plane (char32_t cp) noexcept -> unicode_plane
 
text_decode_result decode_codepoint (bytelike_range auto range, text_encoding encoding)
 Attempts to decode the first codepoint in bytelike range range, assuming it is encoded in encoding.
 
template<bytelike BYTE_TYPE, size_t N>
text_encoding consume_bom (std::span< BYTE_TYPE, N > &spn)
 Consumes (see consume()) a byte order mark from the beginning of spn (a span of bytelike), and returns the encoding that the BOM represents (or unknown_text_encoding if no BOM).
 
text_encoding consume_bom (string_view8 auto &sv)
 Consumes (see consume()) a byte order mark from the beginning of sv, and returns the encoding that the BOM represents (or unknown_text_encoding if no BOM).
 
text_encoding consume_bom (string_view16 auto &sv)
 Consumes (see consume()) a byte order mark from the beginning of sv, and returns the UTF-16 encoding that the BOM represents (or unknown_text_encoding if no BOM).
 
text_encoding consume_bom (string_view32 auto &sv)
 Consumes (see consume()) a byte order mark from the beginning of sv, and returns the UTF-32 encoding that the BOM represents (or unknown_text_encoding if no BOM).
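
 The BOM-consuming idea can be sketched for a plain byte view as below. bom_kind and consume_bom_sketch are hypothetical stand-ins for text_encoding and consume_bom; the byte sequences are the standard Unicode byte order marks, but the order in which the library tests them is an assumption.

```cpp
#include <string_view>

// Hypothetical stand-in for the library's text_encoding result.
enum class bom_kind { none, utf8, utf16_le, utf16_be, utf32_le, utf32_be };

// Removes a leading byte order mark from `bytes`, if any, and reports which one it was.
inline bom_kind consume_bom_sketch(std::string_view& bytes) noexcept
{
    using namespace std::string_view_literals;
    struct entry { std::string_view mark; bom_kind kind; };
    // Order matters: the UTF-32 LE mark begins with the UTF-16 LE mark (FF FE).
    const entry table[] = {
        { "\xEF\xBB\xBF"sv,     bom_kind::utf8 },
        { "\xFF\xFE\x00\x00"sv, bom_kind::utf32_le },
        { "\x00\x00\xFE\xFF"sv, bom_kind::utf32_be },
        { "\xFF\xFE"sv,         bom_kind::utf16_le },
        { "\xFE\xFF"sv,         bom_kind::utf16_be },
    };
    for (const entry& e : table) {
        if (bytes.substr(0, e.mark.size()) == e.mark) {
            bytes.remove_prefix(e.mark.size());
            return e.kind;
        }
    }
    return bom_kind::none; // nothing consumed
}
```

 Testing UTF-32 LE first has a known cost: UTF-16 LE text that begins with a NUL character after its BOM is indistinguishable from a UTF-32 LE BOM.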
 
template<bytelike_range T>
text_encoding detect_encoding (T const &range)
 Attempts to detect the encoding of a given bytelike range.
 
template<typename T >
constexpr char32_t consume_codepoint (T &str)
 Consumes a codepoint from a UTF-encoded string and returns it.
 
template<typename T >
constexpr void append_codepoint (T &str, char32_t cp)
 Appends a codepoint to a UTF-encoded string. Supports UTF-8, UTF-16 and UTF-32, decides based on char type of str.
 
template<typename TO , typename FROM >
constexpr void transcode_unicode (FROM const &from, TO &out)
 Converts a UTF-encoded string to a UTF-encoded string, of a different encoding. Decides the encodings based on the char type of TO and FROM.
 
template<typename TO , typename FROM >
constexpr TO transcode_unicode (FROM const &from)
 Converts a UTF-encoded string to a UTF-encoded string, of a different encoding. Decides the encodings based on the char type of TO and FROM.
 
template<typename T >
constexpr void transcode_codepage_to_unicode (T &dest, stringable8 auto source, std::span< char32_t const, 128 > codepage_map)
 Transcodes an Extended ASCII string source into unicode-encoded dest, according to codepage_map.
 
template<typename RESULT = std::string>
constexpr auto transcode_codepage_to_unicode (stringable8 auto source, std::span< char32_t const, 128 > codepage_map) -> RESULT
 Transcodes an Extended ASCII string source into a unicode encoding, according to codepage_map
 
constexpr std::pair< char32_t, char32_t > codepoint_to_surrogate_pair (char32_t cp) noexcept
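
 This entry has no description. Assuming it is the inverse of surrogate_pair_to_codepoint and returns the pair as (high, low), the standard UTF-16 arithmetic would be as in the sketch below. to_surrogates_sketch is a hypothetical name, and the sketch assumes cp is in the range 0x10000 to 0x10FFFF.

```cpp
#include <utility>

// Hypothetical helper: splits a supplementary-plane codepoint into (high, low) surrogates.
// Assumes 0x10000 <= cp <= 0x10FFFF; no validation is performed.
constexpr std::pair<char32_t, char32_t> to_surrogates_sketch(char32_t cp) noexcept
{
    const char32_t v = cp - 0x10000;                            // 20 significant bits
    const auto high = static_cast<char32_t>(0xD800 + (v >> 10));   // top 10 bits
    const auto low  = static_cast<char32_t>(0xDC00 + (v & 0x3FF)); // bottom 10 bits
    return { high, low };
}
```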
 
template<string8 T>
constexpr T to_utf8 (char32_t cp)
 Returns cp encoded as a UTF-8 string.
 
template<string16 T>
constexpr T to_utf16 (char32_t cp)
 Returns cp encoded as a UTF-16 string.
 
template<string16 T, stringable8 STR>
constexpr T to_utf16 (STR str)
 Returns str (a UTF-8-encoded string) encoded as a UTF-16 string.
 
Make Functions

Functions that create string_view and string types from various values

Rationale
Even though C++20 gives std::string_view an iterator-pair constructor, that constructor uses operator- on its arguments, which means that iterators from disparate string_views do not work, even if they point into the same contiguous string (for example, when both views are substrings of the same string_view). Hence these functions. They should work with any pair of contiguous iterators (no support for sentinels yet, unfortunately), support nulls, and are almost as strong as the string_view constructor in terms of exception and type safety. As with the respective constructors, the behavior is undefined when start > end.
template<typename C = char>
constexpr std::basic_string_view< C > make_sv (std::nullptr_t, std::nullptr_t) noexcept
 
template<stringable_base_type CT, std::contiguous_iterator IT, std::contiguous_iterator IT2>
requires charable<std::iter_value_t<IT>>
constexpr auto make_sv (IT start, IT2 end) noexcept(noexcept(std::to_address(start)))
 
template<std::contiguous_iterator IT, std::contiguous_iterator IT2>
requires charable<std::iter_value_t<IT>>
constexpr auto make_sv (IT start, IT2 end) noexcept(noexcept(std::to_address(start)))
 
template<typename T >
requires stringable_base_type<std::remove_cvref_t<T>>
constexpr auto make_sv (T &&single_char) noexcept
 
template<charable T>
constexpr auto make_sv (const T *str) noexcept
 
template<typename C >
constexpr std::basic_string_view< C > make_sv (std::basic_string_view< C > id) noexcept
 
template<typename C >
constexpr std::basic_string_view< C > make_sv (std::basic_string< C > const &id) noexcept
 
template<typename C >
constexpr std::basic_string_view< C > make_sv (std::basic_string< C > &&id) noexcept=delete
 
template<std::ranges::range RANGE>
requires charable<std::ranges::range_value_t<RANGE>>
constexpr auto make_sv (RANGE &&range) noexcept
 
template<typename... NONARGS, typename... ARGS>
constexpr auto make_string (ARGS &&... args)
 
to_string Functions

Basic identity and utility to_string functions.

See also
Stringification
std::string to_string (std::string_view from) noexcept
 
std::string to_string (std::u8string_view from) noexcept
 
template<typename T >
requires requires { std::to_string(t); }
std::string to_string (T const &t)
 
constexpr std::string const & to_string (std::same_as< std::string > auto const &s)
 
template<typename T >
std::string to_string (std::optional< T > const &o)
 
Trimming Functions

Functions that trim (remove ascii whitespace from) strings and string_views.
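
 The member names suggest two families: trimmed_* functions return a trimmed result, while trim_* functions modify a string_view in place. A minimal sketch of the first kind is shown below. The _sketch names are hypothetical, and the exact set of characters the library treats as ASCII whitespace is an assumption (the six characters matched by std::isspace in the "C" locale).

```cpp
#include <string_view>

// Hypothetical helper: the classic ASCII whitespace set.
constexpr bool is_ascii_space_sketch(char c) noexcept
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
}

// Returns a view of `str` without leading and trailing ASCII whitespace.
// The argument is taken by value, so the caller's view is left unchanged.
constexpr std::string_view trimmed_whitespace_sketch(std::string_view str) noexcept
{
    while (!str.empty() && is_ascii_space_sketch(str.front())) str.remove_prefix(1);
    while (!str.empty() && is_ascii_space_sketch(str.back())) str.remove_suffix(1);
    return str;
}
```

 An in-place variant would take std::string_view& and assign the result back to its argument.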

constexpr std::string_view trimmed_whitespace_right (std::string_view str) noexcept
 
constexpr std::string_view trimmed_whitespace_left (std::string_view str) noexcept
 
constexpr std::string_view trimmed_whitespace (std::string_view str) noexcept
 
constexpr std::string_view trimmed_until (std::string_view str, char chr) noexcept
 
constexpr std::string_view trimmed (std::string_view str, char chr) noexcept
 
constexpr std::string trimmed_whitespace_right (std::string str) noexcept
 
constexpr std::string trimmed_whitespace_left (std::string str) noexcept
 
constexpr std::string trimmed_whitespace (std::string str) noexcept
 
constexpr std::string trimmed_until (std::string str, char chr) noexcept
 
constexpr std::string trimmed (std::string str, char chr) noexcept
 
template<typename FUNC >
requires std::is_invocable_r_v<bool, FUNC, char>
std::string_view trimmed_while (std::string_view str, FUNC &&func) noexcept
 
constexpr void trim_whitespace_right (std::string_view &str) noexcept
 
constexpr void trim_whitespace_left (std::string_view &str) noexcept
 
constexpr void trim_whitespace (std::string_view &str) noexcept
 
constexpr void trim_until (std::string_view &str, char chr) noexcept
 
constexpr void trim (std::string_view &str, char chr) noexcept
 
template<typename FUNC >
requires std::is_invocable_r_v<bool, FUNC, char>
constexpr void trim_while (std::string_view &str, FUNC &&func) noexcept
 
Consume Functions

Functions that "consume" parts of a string_view (that is, remove a section from the beginning or end if the conditions apply).

Most of the functions return the consumed part, or 'true/false' if the part to be consumed is given explicitly. These functions do nothing (or the maximum safe amount) if there is nothing appropriate available to consume.
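
 The consume pattern described above can be sketched in a few lines. The two helpers below are hypothetical illustrations (not the library's code): one takes an explicit value and reports success as a bool, the other takes a predicate and returns the consumed part.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: removes `val` from the front of `str` if it is there.
inline bool consume_sketch(std::string_view& str, std::string_view val) noexcept
{
    if (str.substr(0, val.size()) != val) return false; // nothing consumed on mismatch
    str.remove_prefix(val.size());
    return true;
}

// Hypothetical helper: removes and returns the longest prefix whose characters satisfy `pred`.
template <typename Pred>
std::string_view consume_while_sketch(std::string_view& str, Pred&& pred)
{
    std::size_t n = 0;
    while (n < str.size() && pred(str[n])) ++n;
    const std::string_view consumed = str.substr(0, n);
    str.remove_prefix(n);
    return consumed;
}
```

 Chaining such calls gives a small hand-written parser: on the input "key=123;", consuming "key", then "=", then a run of digits leaves ";" in the view.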

char consume (std::string_view &str)
 Consumes and returns the first character in the str, or \0 if no more characters.
 
bool consume (std::string_view &str, char val)
 Consumes the character val if it's at the beginning of str
 
bool consume (std::string_view &str, std::string_view val)
 Consumes the string val if it's at the beginning of str.
 
template<typename... ARGS>
char consume_any (std::string_view &str, ARGS &&... args)
 Consumes any of the characters in args if it's the first char of str.
 
template<typename PRED >
requires std::is_invocable_r_v<bool, PRED, char>
char consume (std::string_view &str, PRED &&pred)
 Consumes a character from the beginning of str if it matches pred(str[0]).
 
char consume_or (std::string_view &str, char or_else)
 Consumes the first character from str, returning it, or or_else if string is empty.
 
bool consume_at_end (std::string_view &str, char val)
 Consumes the last character from str if it matches val.
 
bool consume_at_end (std::string_view &str, std::string_view val)
 Consumes the string val from the end of str.
 
template<typename FUNC >
requires std::is_invocable_r_v<bool, FUNC, char>
std::string_view consume_while (std::string_view &str, FUNC &&pred)
 Consumes characters from the beginning of str while they match pred(str[0]).
 
std::string_view consume_while (std::string_view &str, char c)
 Consumes characters from the beginning of str while they are equal to c.
 
template<typename... ARGS>
std::string_view consume_while_any (std::string_view &str, ARGS &&... args)
 Consumes a run of any of the characters in args at the beginning of str.
 
template<typename FUNC >
requires std::is_invocable_r_v<bool, FUNC, char>
std::string_view consume_until (std::string_view &str, FUNC &&pred)
 Consumes characters from the beginning of str until one matches pred(str[0]), exclusive.
 
std::string_view consume_until (std::string_view &str, char c)
 Consumes characters from the beginning of str until one is equal to c, exclusive.
 
std::string_view consume_until (std::string_view &str, std::string_view end)
 Consumes characters from the beginning of str until the string starts with end, exclusive.
 
template<typename... ARGS>
std::string_view consume_until_any (std::string_view &str, ARGS &&... args)
 Consumes characters from the beginning of str until one is equal to any in the parameter pack, exclusive.
 
std::string_view consume_until_delim (std::string_view &str, char c)
 Consumes characters from the beginning of str until one is equal to c, inclusive.
 
std::string_view consume_n (std::string_view &str, size_t n)
 Consumes at most n characters from the beginning of str.
 
template<typename FUNC >
requires std::is_invocable_r_v<bool, FUNC, char>
std::string_view consume_n (std::string_view &str, size_t n, FUNC &&pred)
 Consumes at most n characters from the beginning of str that match pred(str[0]).
 
template<typename CALLBACK >
requires std::is_invocable_r_v<bool, CALLBACK, std::string_view&>
bool consume_delimited_list_non_empty (std::string_view &str, std::string_view delimiter, CALLBACK callback)
 Consumes a list of delimiter-delimited strings, calling callback(str) each time; whitespace before and after items is trimmed.
 
template<typename CALLBACK >
requires std::is_invocable_r_v<bool, CALLBACK, std::string_view>
bool consume_delimited_list (std::string_view &str, std::string_view delimiter, std::string_view closer, CALLBACK callback)
 Consumes a list of delimiter-delimited strings, ended with closer, calling callback(str) each time; whitespace before and after items is trimmed.
 
Split Functions

Functions that split strings into multiple parts, each delimited with some sort of delimiter.
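
 A callback-based split and a vector-returning wrapper can be sketched as follows. The names are hypothetical. The listing does not state what the bool passed to the callback means; the sketch assumes it flags the last part.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper: calls func(part, is_last) for every part of `source`
// separated by `delim`. Empty parts are reported, not skipped.
template <typename Func>
void split_sketch(std::string_view source, char delim, Func&& func)
{
    std::size_t next = 0;
    while ((next = source.find(delim)) != std::string_view::npos) {
        func(source.substr(0, next), false);
        source.remove_prefix(next + 1); // skip the part and its delimiter
    }
    func(source, true); // the remainder is the last part (possibly empty)
}

// Hypothetical wrapper collecting the parts into a vector of views into `source`.
inline std::vector<std::string_view> split_to_vector_sketch(std::string_view source, char delim)
{
    std::vector<std::string_view> parts;
    split_sketch(source, delim, [&parts](std::string_view part, bool) { parts.push_back(part); });
    return parts;
}
```

 With this basic behaviour, "a,,b" split on ',' yields three parts, the middle one empty; a "natural" split as described further down would skip the empty part.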

template<typename FUNC >
requires std::is_invocable_v<FUNC, std::string_view, bool>
constexpr void split (std::string_view source, char delim, FUNC &&func) noexcept(noexcept(func(std::string_view{}, true)))
 Performs a basic "split" operation, calling func for each part of source delimited by delim.
 
template<typename FUNC >
requires std::is_invocable_v<FUNC, std::string_view, bool>
constexpr void split (std::string_view source, std::string_view delim, FUNC &&func) noexcept(noexcept(func(std::string_view{}, true)))
 Performs a basic "split" operation, calling func for each part of source delimited by delim.
 
template<typename FUNC >
requires std::is_invocable_v<FUNC, std::string_view, bool>
constexpr void split_on_any (std::string_view source, std::string_view delim, FUNC &&func) noexcept(noexcept(func(std::string_view{}, true)))
 Performs a basic "split" operation, calling func for each part of source delimited by any character in delim.
 
template<typename DELIM_FUNC , typename FUNC >
requires std::is_invocable_v<FUNC, std::string_view, bool>&& std::is_invocable_r_v<size_t, DELIM_FUNC, std::string_view>
void split_on (std::string_view source, DELIM_FUNC &&delim, FUNC &&func) noexcept(noexcept(func(std::string_view{}, true)) &&noexcept(delim(std::string_view{})))
 Performs a basic "split" operation, calling func for each part of source delimited by the delim function.
 
constexpr std::pair< std::string_view, std::string_view > split_at (std::string_view src, size_t split_at) noexcept
 Does not include the character at split_at in the returned strings.
 
constexpr bool split_at (std::string_view src, size_t split_at, std::string_view &first, std::string_view &second) noexcept
 Does not include the character at split_at in the returned strings.
 
constexpr std::pair< std::string_view, std::string_view > single_split (std::string_view src, char delim) noexcept
 Splits src once on the first instance of delim
 
constexpr std::pair< std::string_view, std::string_view > single_split_last (std::string_view src, char delim) noexcept
 Splits src once on the last instance of delim
 
constexpr bool single_split (std::string_view src, char delim, std::string_view &first, std::string_view &second) noexcept
 Splits src once on the first instance of delim
 
constexpr bool single_split_last (std::string_view src, char delim, std::string_view &first, std::string_view &second) noexcept
 Splits src once on the last instance of delim
 
template<typename FUNC >
requires std::is_invocable_v<FUNC, std::string_view, bool>
void natural_split (std::string_view source, char delim, FUNC &&func) noexcept
 Performs a more natural split of the string, that is: ignoring multiple delimiters in a row, and empty items.
 
template<typename RESULT_TYPE = std::string_view, string_or_char DELIM>
constexpr std::vector< RESULT_TYPE > split (std::string_view source, DELIM &&delim) noexcept
 Performs a basic "split" operation, returning a std::vector of the split parts.
 
template<typename RESULT_TYPE = std::string_view>
constexpr std::vector< RESULT_TYPE > split_on_any (std::string_view source, std::string_view delim) noexcept
 Performs a basic "split" operation, returning a std::vector of the split parts.
 
template<typename RESULT_TYPE = std::string_view, typename DELIM_FUNC >
requires std::is_invocable_r_v<size_t, DELIM_FUNC, std::string_view>
std::vector< RESULT_TYPE > split_on (std::string_view source, DELIM_FUNC &&delim) noexcept(noexcept(delim(std::string_view{})))
 Performs a basic "split" operation, returning a std::vector of the split parts.
 
template<typename RESULT_TYPE = std::string_view, string_or_char DELIM>
std::vector< RESULT_TYPE > natural_split (std::string_view source, DELIM &&delim) noexcept
 Performs a more natural split of the string, that is: ignoring multiple delimiters in a row, and empty items; returns a std::vector of the split parts.
 
Join Functions

Functions that join a range of formattable elements into a single string

Note
Formatting is done using stream operators (operator<<).
Todo:
Use Stringification instead
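
A stream-based join of the kind the note describes could look like the sketch below. join_sketch is a hypothetical name, and the only assumption it makes about the elements is that they support operator<<.

```cpp
#include <sstream>
#include <string>
#include <string_view>

// Hypothetical helper: joins the elements of `source`, formatted with operator<<,
// placing `delim` between elements only (never before the first or after the last).
template <typename Range>
std::string join_sketch(Range const& source, std::string_view delim)
{
    std::ostringstream out;
    bool first = true;
    for (auto const& element : source) {
        if (!first) out << delim;
        out << element;
        first = false;
    }
    return out.str();
}
```

A join_and-style variant would differ only in choosing a different separator before the final element.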
template<std::ranges::range T>
auto join (T &&source)
 Returns a string that is created by joining together string representation of the elements in the source range.
 
template<std::ranges::range T, string_or_char DELIM>
auto join (T &&source, DELIM const &delim)
 Returns a string that is created by joining together string representation of the elements in the source range, separated by delim; delim is only added between elements.
 
template<std::ranges::range... RANGES, string_or_char DELIM>
auto join_multiple (DELIM const &delim, RANGES &&... sources)
 Returns a string that is created by joining together string representation of the elements in the sources ranges, separated by delim; delim is only added between elements.
 
template<std::ranges::range T, string_or_char DELIM, string_or_char LAST_DELIM>
auto join_and (T &&source, DELIM const &delim, LAST_DELIM &&last_delim)
 Returns a string that is created by joining together string representation of the elements in the source range, separated by delim; delim is only added between elements; the last element is delimited by last_delim instead of delim.
 
template<std::ranges::range T, string_or_char DELIM, string_or_char LAST_DELIM, typename FUNC >
auto join_and (T &&source, DELIM const &delim, LAST_DELIM &&last_delim, FUNC &&transform_func)
 Same as join_and(T&& source, DELIM const& delim, LAST_DELIM&& last_delim) except each element is transformed by transform_func before being stringified and added to the result.
 
template<std::ranges::range T, typename FUNC , string_or_char DELIM>
auto join (T &&source, DELIM const &delim, FUNC &&transform_func)
 Same as join(T&& source, DELIM const& delim) except each element is transformed by transform_func before being stringified and added to the result.
 
Replace and Escape Functions
Warning
A lot of these functions have stupid and/or subtle bugs, or are not intuitive in their behavior. Use at your own risk. Pull requests welcome.
Todo:
C++23's format has string format specifications that escape automatically; use those when they become available
template<string_or_char NEEDLE, string_or_char REPLACE>
void replace (std::string &subject, NEEDLE &&search, REPLACE &&replace)
 
template<string_or_char NEEDLE, string_or_char REPLACE>
std::string replaced (std::string subject, NEEDLE &&search, REPLACE &&replace)
 
template<string_or_char DELIMITER = char, string_or_char ESCAPE = char>
void quote (std::string &subject, DELIMITER delimiter='"', ESCAPE escape = '\\')
 
template<string_or_char DELIMITER = char, string_or_char ESCAPE = char>
std::string quoted (std::string &&subject, DELIMITER &&delimiter='"', ESCAPE&& escape = '\\')
 
template<string_or_char DELIMITER = char, string_or_char ESCAPE = char>
std::string quoted (std::string_view subject, DELIMITER &&delimiter='"', ESCAPE&& escape = '\\')
 
template<string_or_char DELIMITER = char, string_or_char ESCAPE = char>
std::string quoted (const char *subject, DELIMITER &&delimiter='"', ESCAPE&& escape = '\\')
 
template<typename ESCAPE_FUNC >
requires std::is_invocable_v<ESCAPE_FUNC, std::string_view>&& std::is_constructible_v<std::string_view, std::invoke_result_t<ESCAPE_FUNC, std::string_view>>
void escape (std::string &subject, std::string_view chars_to_escape, ESCAPE_FUNC &&escape_func)
 
template<string_or_char ESCAPE = char>
void escape (std::string &subject, std::string_view chars_to_escape, ESCAPE &&escape='\\')
 
template<typename ESCAPE_FUNC , typename ISPRINTABLE_FUNC = decltype(ascii::isprint)>
void escape_non_printable (std::string &subject, ESCAPE_FUNC &&escape_func, ISPRINTABLE_FUNC &&isprintable_func=ascii::isprint)
 
void escape_non_printable (std::string &subject)
 
template<typename STR , string_or_char ESCAPE = char>
std::string escaped (STR &&subject, std::string_view to_escape="\"\\", ESCAPE &&escape_str='\\')
 Lint Note: Changing the initializer of to_escape to a R-string breaks doxygen.
 
template<typename STR >
std::string escaped_non_printable (STR &&subject)
 
sto* replacements

Functions equivalent to std::stoi, std::stod, etc. that take std::string_view as their first argument
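
 One plausible way to build such a replacement is on top of std::from_chars, which already works on a character range. The sketch below is an assumption about the approach, not the library's code; stoi_sketch is a hypothetical name. Unlike std::stoi, std::from_chars does not skip leading whitespace or accept a leading '+', so a faithful replacement would have to handle those separately.

```cpp
#include <charconv>
#include <cstddef>
#include <stdexcept>
#include <string_view>
#include <system_error>

// Hypothetical helper mirroring the std::stoi interface for std::string_view input.
inline int stoi_sketch(std::string_view str, std::size_t* idx = nullptr, int base = 10)
{
    int value = 0;
    const auto [ptr, ec] = std::from_chars(str.data(), str.data() + str.size(), value, base);
    // Map from_chars error codes onto the exceptions std::stoi is specified to throw.
    if (ec == std::errc::invalid_argument) throw std::invalid_argument("stoi_sketch: no conversion");
    if (ec == std::errc::result_out_of_range) throw std::out_of_range("stoi_sketch: out of range");
    // Report how many characters were parsed, as std::stoi does through `idx`.
    if (idx) *idx = static_cast<std::size_t>(ptr - str.data());
    return value;
}
```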

int stoi (std::string_view str, size_t *idx=nullptr, int base=10)
 
long stol (std::string_view str, size_t *idx=nullptr, int base=10)
 
long long stoll (std::string_view str, size_t *idx=nullptr, int base=10)
 
unsigned long stoul (std::string_view str, size_t *idx=nullptr, int base=10)
 
unsigned long long stoull (std::string_view str, size_t *idx=nullptr, int base=10)
 
float stof (std::string_view str, size_t *idx=nullptr, std::chars_format format=std::chars_format::general)
 
double stod (std::string_view str, size_t *idx=nullptr, std::chars_format format=std::chars_format::general)
 
long double stold (std::string_view str, size_t *idx=nullptr, std::chars_format format=std::chars_format::general)
 
UTF-8 functions
constexpr size_t codepoint_utf8_count (char32_t cp) noexcept
 Returns the number of UTF-8 octets necessary to encode the given codepoint.
 
constexpr char32_t consume_utf8 (string_view8 auto &str)
 Consumes (see consume()) a UTF-8 codepoint from str.
 
constexpr size_t count_utf8_codepoints (stringable8 auto str)
 Returns the number of codepoints in the given UTF-8 string str
 
constexpr size_t append_utf8 (string8 auto &buffer, char32_t cp)
 Appends octets to buffer by encoding cp into UTF-8.
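
 The octet count and the encoding step follow directly from the UTF-8 bit layout. The sketch below uses hypothetical names, assumes the size_t result is the number of octets appended (the listing does not say), and performs no validation of surrogates or of values above 0x10FFFF.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: number of UTF-8 octets needed for `cp` (no validation).
constexpr std::size_t utf8_length_sketch(char32_t cp) noexcept
{
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

// Hypothetical helper: appends the UTF-8 encoding of `cp` and returns the octet count.
inline std::size_t append_utf8_sketch(std::string& buffer, char32_t cp)
{
    const std::size_t count = utf8_length_sketch(cp);
    // Lead-byte prefixes for 1..4 octet sequences: 0xxxxxxx, 110xxxxx, 1110xxxx, 11110xxx.
    constexpr unsigned char lead_prefix[] = { 0x00, 0xC0, 0xE0, 0xF0 };
    const unsigned shift = 6 * static_cast<unsigned>(count - 1);
    buffer += static_cast<char>(lead_prefix[count - 1] | (cp >> shift));
    // Each continuation byte carries 6 payload bits behind the 10xxxxxx marker.
    for (std::size_t i = 1; i < count; ++i) {
        const unsigned s = 6 * static_cast<unsigned>(count - 1 - i);
        buffer += static_cast<char>(0x80 | ((cp >> s) & 0x3F));
    }
    return count;
}
```

 For example, U+20AC encodes to the three octets E2 82 AC.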
 
template<string8 RESULT = std::string>
constexpr RESULT to_utf8 (char32_t cp)
 Returns cp encoded as a UTF-8 string.
 
template<string8 RESULT = std::string, stringable16 STR>
constexpr RESULT to_utf8 (STR &&str)
 Returns str (a UTF-16-encoded string) encoded as a UTF-8 string.
 
std::string to_string (std::wstring_view str)
 Returns str (a UTF-16-encoded string) encoded as a UTF-8 string.
 
constexpr void transcode_codepage_to_utf8 (string8 auto &dest, stringable8 auto source, std::span< char32_t const, 128 > codepage_map)
 Transcodes an Extended ASCII string source into UTF-8 dest, according to codepage_map
 
template<string8 RESULT = std::string>
constexpr auto transcode_codepage_to_utf8 (stringable8 auto source, std::span< char32_t const, 128 > codepage_map) -> RESULT
 Transcodes an Extended ASCII string source into UTF-8, according to codepage_map
 
UTF-16 functions
constexpr char32_t consume_utf16 (string_view16 auto &str)
 Consumes (see consume()) a UTF-16 codepoint from str.
 
constexpr size_t append_utf16 (string16 auto &buffer, char32_t cp)
 Appends 16-bit values to buffer by encoding cp into UTF-16.
 
template<string16 RESULT = std::wstring>
constexpr RESULT to_utf16 (char32_t cp)
 Returns cp encoded as a UTF-16 string.
 
template<string16 RESULT = std::wstring, stringable8 STR>
constexpr RESULT to_utf16 (STR str)
 Returns str (a UTF-8-encoded string) encoded as a UTF-16 string.
 
std::wstring to_wstring (std::string_view str)
 Returns str (a UTF-8-encoded string) encoded as a UTF-16/32 string in a std::wstring (depending on the size of wchar_t)
 
UTF-32 functions
constexpr char32_t consume_utf32 (string_view32 auto &str)
 Consumes (see consume()) a UTF-32 codepoint from str.
 
constexpr size_t append_utf32 (string32 auto &buffer, char32_t cp)
 Appends 32-bit values to buffer by encoding cp into UTF-32.
 

Variables

constexpr char32_t last_unicode_code_point
 
constexpr char32_t first_unicode_high_surrogate
 
constexpr char32_t last_unicode_high_surrogate
 
constexpr char32_t first_unicode_low_surrogate
 
constexpr char32_t last_unicode_low_surrogate
 
Encodings

Values representing UTF encodings

constexpr text_encoding utf8_encoding
 
constexpr text_encoding utf16_le_encoding
 
constexpr text_encoding utf16_be_encoding
 
constexpr text_encoding utf32_le_encoding
 
constexpr text_encoding utf32_be_encoding
 
constexpr text_encoding unknown_text_encoding
 Represents an unknown text encoding (e.g. when an encoding could not be determined)
 

Detailed Description

Function Documentation

◆ codepoint_to_surrogate_pair()

constexpr std::pair< char32_t, char32_t > ghassanpl::string_ops::codepoint_to_surrogate_pair ( char32_t  cp)
constexpr noexcept

Definition at line 431 of file unicode.h.

Variable Documentation

◆ first_unicode_high_surrogate

constexpr char32_t ghassanpl::string_ops::first_unicode_high_surrogate
inline constexpr

Definition at line 414 of file unicode.h.

◆ first_unicode_low_surrogate

constexpr char32_t ghassanpl::string_ops::first_unicode_low_surrogate
inline constexpr

Definition at line 416 of file unicode.h.

◆ last_unicode_code_point

constexpr char32_t ghassanpl::string_ops::last_unicode_code_point
inline constexpr

Definition at line 413 of file unicode.h.

◆ last_unicode_high_surrogate

constexpr char32_t ghassanpl::string_ops::last_unicode_high_surrogate
inline constexpr

Definition at line 415 of file unicode.h.

◆ last_unicode_low_surrogate

constexpr char32_t ghassanpl::string_ops::last_unicode_low_surrogate
inline constexpr

Definition at line 417 of file unicode.h.