C++: counting the number of words in each string [duplicate]

Question

A workaround can be to create a file (for example, list.txt) with nothing inside. In that file you can set custom metadata (a Map<String, String>) that lists the URLs of all the files. Then, if you need to download all the files in a folder, you first download the metadata of list.txt, iterate over the custom data, and download every file whose URL is in the map.

12

visual-studio

asked by Chubsdad on 9 September 2010 at 02:37

8 answers

Another Boost-based solution that may work (untested):

vector<string> result;
split(result, "aaaa bbbb cccc", is_any_of(" \t\n\v\f\r"), token_compress_on);

More information can be found in the Boost String Algorithms Library documentation.

4

answered by Christopher Hunt on 5 September 2018 at 10:10

A single-pass algorithm for this problem can be the following:

1. If the string is empty, return 0.

2. Let transitions = the number of adjacent character pairs (c1, c2) where c1 == ' ' and c2 != ' '.

3. If the sentence starts with a space, return transitions; otherwise return transitions + 1.

Here is an example with string = "A very, very, very, very, very big dog ate my homework!!!!"
 i | 0123456789
c2 |  A very, very, very, very, very big dog ate my homework!!!!
c1 | A very, very, very, very, very big dog ate my homework!!!!
   |  x     x     x     x     x    x   x   x   x  x
Explanation
Let `i` be the loop counter.

When i=0: c1='A' and c2=' ', the condition `c1 == ' '` and `c2 != ' '` is not met
When i=1: c1=' ' and c2='A', the condition is met
... and so on for the remaining characters
This algorithm handles cases with multiple consecutive spaces correctly.
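The three steps above can be transcribed almost literally. The following is only a minimal sketch: the function name count_words_by_transitions is hypothetical, it returns 0 for an empty string (as the implementations in this answer do), and it treats only ' ' as a separator, exactly as the step list is written.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical name; a literal transcription of the three steps above.
std::size_t count_words_by_transitions(std::string_view s)
{
    // Step 1: an empty string contains no words.
    if (s.empty()) return 0;

    // Step 2: count adjacent pairs where a space is followed by a non-space.
    std::size_t transitions = 0;
    for (std::size_t i = 1; i < s.size(); ++i) {
        if (s[i - 1] == ' ' && s[i] != ' ') ++transitions;
    }

    // Step 3: a first word that is not preceded by a space produces
    // no transition of its own, so it is added separately.
    return s.front() == ' ' ? transitions : transitions + 1;
}
```

For the example sentence this yields 11: ten space-to-word transitions plus the leading word "A". Tabs and newlines are not recognised as separators in this sketch.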

Here are two solutions that I came up with:

Naive solution
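#include <cctype>       // std::isspace
#include <cstddef>      // std::size_t
#include <string_view>
#include <utility>      // std::exchange

std::size_t count_words_naive(const std::string_view& s)
{
    if (s.size() == 0) return 0;
    std::size_t count = 0;
    // Start as if the text were preceded by a space, so a leading word is counted.
    bool isspace1 = true, isspace2 = true;
    for (unsigned char c : s) {  // unsigned char keeps std::isspace well-defined
        isspace1 = std::exchange(isspace2, std::isspace(c) != 0);
        count += (isspace1 && !isspace2);  // a word starts at each space -> non-space step
    }
    return count;
}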
If you think about it carefully, you can reduce this set of operations to an inner product, which is done below.

Inner product solution
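#include <cctype>       // std::isspace
#include <cstddef>      // std::size_t
#include <functional>   // std::plus
#include <numeric>      // std::inner_product
#include <string_view>

std::size_t count_words_using_inner_prod(const std::string_view& s)
{
    if (s.size() == 0) return 0;
    auto is_space = [](unsigned char c) { return std::isspace(c) != 0; };
    // Pair every character c2 with its predecessor c1 and sum the space -> non-space steps.
    auto num_transitions = std::inner_product(
            s.begin()+1, s.end(), s.begin(), std::size_t{0}, std::plus<>(),
            [&](char c2, char c1) { return is_space(c1) && !is_space(c2); });
    return num_transitions + !is_space(s.front());
}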

0

answered by Lakshay Garg on 5 September 2018 at 10:10

Assuming words are whitespace-separated:

unsigned int countWordsInString(std::string const& str)
{
    std::stringstream stream(str);
    return std::distance(std::istream_iterator<std::string>(stream), std::istream_iterator<std::string>());
}

Note: there may be more than one space between words. Also, counting spaces does not catch other whitespace characters such as tab, newline, or carriage return. So counting spaces is not enough.
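To make the note concrete, here is a small sketch that contrasts the "count the spaces" shortcut with stream extraction. Both helper names (naive_space_count and stream_word_count) are hypothetical and exist only for this comparison.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: the shortcut the note warns about (spaces + 1).
std::size_t naive_space_count(const std::string& s)
{
    return static_cast<std::size_t>(std::count(s.begin(), s.end(), ' ')) + 1;
}

// Hypothetical helper: counts the tokens produced by operator>>,
// which skips any run of whitespace (spaces, tabs, newlines, ...).
std::size_t stream_word_count(const std::string& s)
{
    std::istringstream stream(s);
    return static_cast<std::size_t>(std::distance(
        std::istream_iterator<std::string>(stream),
        std::istream_iterator<std::string>()));
}
```

With the input "one  two" (two spaces) the shortcut reports 3 while the stream version reports 2. With "one\ttwo" the shortcut reports 1 while the stream version reports 2.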

The stream input operator >>, when used to read a string from a stream, reads one whitespace-separated word. So they were probably looking for you to use this to identify words.

std::stringstream  stream(str);
std::string        oneWord;

stream >> oneWord; // Reads one space separated word.

This can then be used to count the words in a string.

std::stringstream  stream(str);
std::string        oneWord;
unsigned int       count = 0;

while(stream >> oneWord) { ++count;}
// count now has the number of words in the string.

Getting more complicated: streams can be treated just like any other container, and there are iterators to loop through them, namely std::istream_iterator. When you use the ++ operator on an istream_iterator, it just reads the next value from the stream using operator >>. In this case we are reading std::string, so it reads one whitespace-separated word.

std::stringstream  stream(str);
std::string        oneWord;
unsigned int       count = 0;

std::istream_iterator<std::string> loop = std::istream_iterator<std::string>(stream);
std::istream_iterator<std::string> end  = std::istream_iterator<std::string>();

for(;loop != end; ++count, ++loop) { *loop; }

Using std::distance just wraps all of the above in a tidy package, because it finds the distance between two iterators by applying ++ to the first until it reaches the second.

To avoid copying the string, we can be sneaky:

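unsigned int countWordsInString(std::string const& str)
{
    std::stringstream stream;

    // Sneaky way to use the string as the buffer to avoid a copy.
    // Caveat: the effect of setbuf on a string buffer is implementation-defined,
    // so some standard libraries ignore this call and the stream stays empty.
    stream.rdbuf()->pubsetbuf(const_cast<char*>(str.c_str()), str.length());
    return std::distance(std::istream_iterator<std::string>(stream), std::istream_iterator<std::string>());
}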

Note: we still copy each word from the original into a temporary, but the cost of that is minimal.
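If neither the buffer trick nor the per-word temporary is wanted, one alternative (a different technique from the stream-based one in this answer) is to scan a std::string_view with find_first_not_of and find_first_of. This is a sketch under assumptions: the name count_words_no_copy is hypothetical, C++17 is required, and the whitespace set is hard-coded to the six characters that std::isspace accepts in the default "C" locale.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical name; counts whitespace-separated words without copying anything.
std::size_t count_words_no_copy(std::string_view s)
{
    constexpr std::string_view whitespace = " \t\n\v\f\r";

    std::size_t count = 0;
    std::size_t pos = s.find_first_not_of(whitespace);   // start of the first word
    while (pos != std::string_view::npos) {
        ++count;
        pos = s.find_first_of(whitespace, pos);           // end of the current word
        if (pos == std::string_view::npos) break;         // the word runs to the end
        pos = s.find_first_not_of(whitespace, pos);       // start of the next word
    }
    return count;
}
```

A std::string converts implicitly to std::string_view, so the function can be called with either type.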

31

answered by Martin York on 5 September 2018 at 10:10

This can be done without manually looking at every character or copying the string.

#include <boost/iterator/transform_iterator.hpp>
#include <cctype>

boost::transform_iterator
    < int (*)(int), std::string::const_iterator, bool const& >
    pen( str.begin(), std::isalnum ), end( str.end(), std::isalnum );

size_t word_cnt = 0;

while ( pen != end ) {
    word_cnt += * pen;
    pen = std::mismatch( pen+1, end, pen ).first;
}

return word_cnt;

I took the liberty of using isalnum instead of isspace.

This is not something I would do at a job interview. (It's not as if it compiled the first time.)

Or, for all the Boost haters ;v)

if ( str.empty() ) return 0;

size_t word_cnt = std::isalnum( static_cast<unsigned char>( * str.begin() ) ) != 0; // isalnum returns any nonzero value, not necessarily 1

for ( std::string::const_iterator pen = str.begin(); ++ pen != str.end(); ) {
    word_cnt += std::isalnum( pen[ 0 ] ) && ! std::isalnum( pen[ -1 ] );
}

return word_cnt;
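Switching the predicate from isspace to isalnum changes what counts as a word: punctuation then acts as a separator too. The sketch below illustrates the difference. All three names (count_word_starts, count_by_isalnum, count_by_isspace) are hypothetical helpers, not part of the answer.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: counts positions where in_word turns from false to true.
template <typename Pred>
std::size_t count_word_starts(const std::string& s, Pred in_word)
{
    std::size_t count = 0;
    bool prev = false;                 // nothing precedes the first character
    for (unsigned char ch : s) {       // unsigned char keeps the <cctype> calls well-defined
        const bool cur = in_word(ch);
        count += (cur && !prev);
        prev = cur;
    }
    return count;
}

// A word is a run of alphanumeric characters.
std::size_t count_by_isalnum(const std::string& s)
{
    return count_word_starts(s, [](unsigned char c) { return std::isalnum(c) != 0; });
}

// A word is a run of non-whitespace characters.
std::size_t count_by_isspace(const std::string& s)
{
    return count_word_starts(s, [](unsigned char c) { return std::isspace(c) == 0; });
}
```

For "don't stop" the isalnum version reports 3 (the apostrophe splits "don" and "t"), while the isspace version reports 2. For "!!! ???" they report 0 and 2 respectively.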

3

answered by Potatoswatter on 5 September 2018 at 10:10

A very concise O(N) approach:

bool is_letter(char c) { return c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z'; }

int count_words(const string& s) {
    int i = 0, N = s.size(), count = 0;
    while(i < N) {
        while(i < N && !is_letter(s[i])) i++;
        if(i == N) break;
        while(i < N && is_letter(s[i])) i++;
        count++;
    }
    return count;
}

A divide-and-conquer approach, also with O(N) complexity:

int DC(const string& A, int low, int high) {
    if(low > high) return 0;
    int mid = low + (high - low) / 2;

    int count_left = DC(A, low, mid-1);
    int count_right = DC(A, mid+1, high);

    if(!is_letter(A[mid])) 
        return count_left + count_right;
    else {
        if(mid == low && mid == high) return 1;
        if(mid-1 < low) {
            if(is_letter(A[mid+1])) return count_right;
            else return count_right+1;
        } else if(mid+1 > high) {
            if(is_letter(A[mid-1])) return count_left;
            else return count_left+1;
        }
        else {
            if(!is_letter(A[mid-1]) && !is_letter(A[mid+1])) 
                return count_left + count_right + 1;
            else if(is_letter(A[mid-1]) && is_letter(A[mid+1]))
                return count_left + count_right - 1;
            else
                return count_left + count_right;
        }
    }
}

int count_words_divide_n_conquer(const string& s) {
    return DC(s, 0, s.size()-1);
}
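The case analysis above can be hard to follow. An alternative formulation of the same divide-and-conquer idea (not the answer's code, only a sketch) has each half report its word count together with whether it starts and ends inside a word. Merging two halves then subtracts one when a word straddles the boundary. The names Segment, is_ascii_letter, summarize, and count_words_by_halves are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical summary of a range of characters.
struct Segment {
    std::size_t words;      // number of words in the range
    bool starts_in_word;    // first character is a letter
    bool ends_in_word;      // last character is a letter
};

inline bool is_ascii_letter(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// Summarises the non-empty half-open range [low, high) of s.
Segment summarize(const std::string& s, std::size_t low, std::size_t high)
{
    if (high - low == 1) {
        const bool letter = is_ascii_letter(s[low]);
        return {letter ? std::size_t{1} : std::size_t{0}, letter, letter};
    }
    const std::size_t mid = low + (high - low) / 2;
    const Segment left = summarize(s, low, mid);
    const Segment right = summarize(s, mid, high);

    std::size_t words = left.words + right.words;
    // A word that crosses the split was counted once in each half.
    if (left.ends_in_word && right.starts_in_word) --words;
    return {words, left.starts_in_word, right.ends_in_word};
}

std::size_t count_words_by_halves(const std::string& s)
{
    return s.empty() ? 0 : summarize(s, 0, s.size()).words;
}
```

Each character is visited once at the leaves and each merge is constant time, so the total work stays O(N), with recursion depth O(log N).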

0

answered by sunjerry on 5 September 2018 at 10:10

2

answered by Tecoberg on 5 September 2018 at 10:10

An O(N) solution that is also very simple to understand and implement:

(I have not checked for an empty input string, but I am sure you can do that easily.)

#include <cctype>    // isalpha
#include <iostream>
#include <string>
using namespace std;

int countNumberOfWords(string sentence){
    int numberOfWords = 0;
    size_t i;

    if (isalpha(sentence[0])) {
        numberOfWords++;
    }

    for (i = 1; i < sentence.length(); i++) {
        if ((isalpha(sentence[i])) && (!isalpha(sentence[i-1]))) {
            numberOfWords++;
        }
    }

    return numberOfWords;
}

int main()
{
    string sentence;
    cout<<"Enter the sentence : ";
    getline(cin, sentence);

    int numberOfWords = countNumberOfWords(sentence);
    cout<<"The number of words in the sentence is : "<<numberOfWords<<endl;

    return 0;
}

1

answered by totjammykd on 5 September 2018 at 10:10
