    Is Complex Writing Nothing But Formulas? | by Vered Zimmerman | Dec, 2024

    By Team_AIBS News | December 13, 2024 | 9 Mins Read

    Text analytics hints at how volumes of writing get created

    Towards Data Science

    In the broadest of strokes, Natural Language Processing transforms language into constructs that can be usefully manipulated. Since deep-learning embeddings have proven so powerful, they’ve also become the default: pick a model, embed your data, pick a metric, do some RAG. To add new value, it helps to have a different take on crunching language.
    The one I’ll share today started years ago, with a single book.

    The Orchid Thief is both non-fiction and full of mischief. I had first read it in my 20s, skipping most of the historical anecdata, itching for its first-person accounts. At the time, I laughed out loud but turned the pages in quiet fury, that someone could live so deeply and write so well. I wasn’t all that sure these were different things.

    Within a year I had moved to London to start anew.
    I went into financial services, which is like a theme park for nerds. And, for the next decade, I would only take jobs with lots of writing.

    Lots being the operative word.

    Behind the modern façade of professional services, British industry is alive to its old factories and shipyards. It employs Alice to do a thing, and then hand it over to Bob; he turns some screws, and it’s on to Charlie. One month on, we all do it again. As a newcomer, I noticed habits weren’t so much a ditch to fall into, but a mound to stake.

    I was also reading lots. Okay, I was reading the New Yorker. My most favourite thing was to flip a fresh one on its cover, open it from the back, and read the opening sentences of one Anthony Lane, who writes film reviews. Years and years, and not once did I go see a movie.

    Every now and then, a flicker would catch me off-guard. A barely-there thread between the New Yorker corpus and my non-Pulitzer outputs. In both corpora, each piece was different to its siblings, but also…not quite. Similarities echoed. And I knew the ones in my work had arisen out of a repetitive process.

    In 2017 I began meditating on the threshold separating writing that feels formulaic from writing that can be explicitly written out as a formula.

    The argument goes like this: volume of repetition hints at an (often tacit) form of algorithmic decision-making. But procedural repetition leaves fingerprints. Trace the fingerprints to surface the procedure; suss out the algorithm; and the software practically writes itself.

    In my last job, I was no longer writing lots. My software was.

    Companies can, in principle, learn enough about their own flows to reap massive gains, but few bother. Folks seem far more enthralled with what somebody else is doing.

    For example, my bosses, and later my clients, kept wishing their staff could mimic the Economist’s house style. But how would you find which steps the Economist takes to end up sounding the way it does?

    Image by author

    Enter Text Analytics

    Read a single Economist article, and it feels breezy and confident. Read lots of them, and they sound kind of alike. A full printed magazine comes out once a week. Yeah, I was betting on process.

    For fun, let’s apply a readability function (measured in years of education) to several hundred Economist articles. Let’s also do the same to hundreds of articles published by a frustrated European asset manager.

    Then, let’s get ourselves a histogram to see how those readability scores are distributed.
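
    A minimal sketch of those two steps, under stated assumptions: the article does not say which readability formula FinText used, so this uses the Flesch-Kincaid grade level, which is expressed in approximate years of schooling. The syllable counter is a crude heuristic, and the names readability and histogram are illustrative only.

```python
import re
from collections import Counter


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and n > 1:
        n -= 1
    return max(n, 1)


def readability(text: str) -> float:
    """Flesch-Kincaid grade level: roughly the years of education needed."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


def histogram(scores, bin_width=2):
    """Count scores per bin; each key is the lower edge of its bin."""
    return Counter(int(s // bin_width) * bin_width for s in scores)


# Toy stand-ins for the two corpora (hypothetical texts, not real articles).
plain = "The cat sat on the mat. It was warm."
dense = ("Notwithstanding considerable macroeconomic uncertainty, institutional "
         "investors continue reallocating capital towards alternative strategies.")

print(readability(plain), readability(dense))
print(histogram([7.2, 8.9, 9.1, 14.0]))
```

    Run over two real corpora, the two Counter objects can be plotted side by side to compare the distributions.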

    Just two functions, and look at the insights we get!

    Readability profile. Source: FinText

    Notice how separated the curves are; this asset manager is not sounding like the Economist. We could drill further to see what’s causing this disparity. (For a start, it’s often crazy-long sentences.)

    But also, notice how the Economist puts a hard limit on the readability score they allow. The curve is inorganic, betraying that they apply a strict readability check in their editing process.

    Finally (and many of my clients struggled with this), the Economist vows to write plainly enough that an average highschooler could take it in.

    I had expected these charts. I had scribbled them on paper. But when a real one first lit up my screen, it was as if language herself had giggled.

    Now, I wasn’t exactly the first on the scene. In 1964, statisticians Frederick Mosteller and David Wallace landed on the cover of Time magazine, their forensic literary analysis settling a 140-year-old debate over the authorship of a famed dozen of anonymously-written essays.

    But forensic analytics always looks at the single item in relation to two corpora: the one created by the suspected author, and the null hypothesis. Comparative analytics only cares about comparing bodies of text.

    Image by author

    Building A Text Analytics Engine

    Let’s retrace our steps: given a corpus, we applied the same function to each of the texts (the readability function). This mapped the corpus onto a set (in this case, numbers). On this set we applied another function (the histogram). Finally, we did it to two different corpora, and compared the results.

    If you squint, you’ll see I’ve just described Excel.

    What looks like a table is actually a pipeline, crunching columns sequentially. First along the column, followed by functions on the results, followed by comparative analysis functions.
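
    The three stages can be sketched as plain function composition. This is an illustration of the idea rather than FinText’s engine; run_column, compare and word_count are hypothetical names, and the corpora are toy strings.

```python
from statistics import mean
from typing import Callable, Iterable, List


def run_column(corpus: Iterable[str],
               per_text: Callable[[str], float],
               aggregate: Callable[[List[float]], float]) -> float:
    """Stage 1: map a function along the 'column' of texts.
    Stage 2: apply a second function to the resulting values."""
    column = [per_text(text) for text in corpus]
    return aggregate(column)


def compare(corpus_a, corpus_b, per_text, aggregate, contrast):
    """Stage 3: run the same two stages on both corpora and contrast the results."""
    return contrast(run_column(corpus_a, per_text, aggregate),
                    run_column(corpus_b, per_text, aggregate))


def word_count(text: str) -> int:
    """A stand-in per-text function; any text-to-value function would do."""
    return len(text.split())


corpus_a = ["Short and clear.", "Plain words win."]
corpus_b = ["This sentence rambles on for rather longer than it strictly needs to."]

# Difference in mean word count between the two corpora.
gap = compare(corpus_a, corpus_b, word_count, mean, lambda a, b: b - a)
print(gap)
```

    Swapping word_count for a readability function and mean for a histogram gives the comparison described earlier; only the arguments change, not the pipeline.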

    Well, I wanted Excel, but for text.

    Not strings: text. I wanted to apply functions like Count Verbs or First Paragraph Topic or First Important Sentence. And it had to be flexible enough so I could ask any question; who knows what would end up mattering?

    In 2020 this kind of solution didn’t exist, so I built it. And boy did this software not ‘practically write itself’! Making it possible to ask any question needed some good architecture decisions, which I got wrong twice before ironing out the kinks.

    In the end, functions are defined once, by what they do to a single input text. Then, you pick and choose the pipeline steps, and the corpora on which they act.
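
    One way such a design could look, offered as a guess at the shape rather than a description of the actual FinText architecture: each function is registered once under a name and only ever sees a single text, while a separate runner takes a list of step names and a dictionary of corpora. Every identifier here (TEXT_FUNCTIONS, text_function, run_pipeline) is hypothetical, and the two registered functions are deliberately naive stand-ins for richer ones like Count Verbs.

```python
from typing import Callable, Dict, List

# Registry of per-text functions, keyed by the name used in pipeline definitions.
TEXT_FUNCTIONS: Dict[str, Callable[[str], object]] = {}


def text_function(name: str):
    """Decorator that registers a function defined on a single input text."""
    def register(fn):
        TEXT_FUNCTIONS[name] = fn
        return fn
    return register


@text_function("word_count")
def word_count(text: str) -> int:
    return len(text.split())


@text_function("first_sentence")
def first_sentence(text: str) -> str:
    # Naive: everything before the first full stop.
    head, _, _ = text.partition(".")
    return head.strip()


def run_pipeline(steps: List[str], corpora: Dict[str, List[str]]):
    """Apply each chosen step to every text of every chosen corpus."""
    return {
        corpus_name: {step: [TEXT_FUNCTIONS[step](t) for t in texts]
                      for step in steps}
        for corpus_name, texts in corpora.items()
    }


corpora = {
    "magazine": ["Rates rose. Markets shrugged."],
    "manager": ["Our outlook remains cautiously constructive. More follows."],
}
results = run_pipeline(["word_count", "first_sentence"], corpora)
print(results)
```

    Keeping the functions ignorant of corpora is what lets a new question be asked by adding one function and one step name, without touching the runner.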

    With that, I started a writing-tech consulting company, FinText. I planned to build while working with clients, and see what sticks.

    What the Market Said

    The first commercial use case I came up with was social listening. Market research and polling are big business. It’s now the height of the pandemic, everyone’s at home. I figured that processing active chatter on dedicated online communities could be a new way to access consumer thinking.

    Any first software client would have felt special, but this one was exciting, because my concoction actually helped real people get out of a tight spot:

    Working towards a big event, they’d planned to launch a flagship report, with data from a paid YouGov survey. But its results were tepid. So, with their remaining budget, they bought a FinText study. It was our findings that they put front and centre in their final report.

    Social listening on Reddit ‘Investing’, 2020. Source: FinText

    But social listening didn’t take off. Investment land is quirky because pools of money will always need a home; the only question is who’s the landlord. Industry people I talked to mostly wanted to know what their competitors were up to.

    So the second use case, competitive content analytics, was met with a warmer response. I sold about half a dozen companies on this solution (including, for example, Aviva Investors).

    All along, our engine was collecting data no one else had. Such was my savvy, it wasn’t even my idea to run training sessions; a client first asked for one. That’s how I learned companies like buying training.

    Otherwise, my steampunk take on writing was proving difficult to sell. It was all too abstract. What I needed was a dashboard: pretty charts, with real numbers, crunched from live data. A pipeline did the crunching, and I hired a small team to do the pretty charts.

    Text analytics dashboard demo. Source: FinText

    In the dashboard, two charts showed a breakdown of topics, and the rest dissected the writing style. I’ll say a few words about this choice.

    Everyone believes what they say matters. If others don’t care, really it’s a moral failure, of weighing style over substance. A bit like how bad taste is something only other people have.

    Scientists have counted clicks, tracked eyes, monitored scrolls, timed attention. We know it takes a split second for readers to decide whether something is “for them”, and they decide by vaguely comparing new information to what they already like. Style is an entry pass.

    What The Dashboard Showed

    Before, I hadn’t been tracking the data being collected, but now I had all these pretty charts. And they were showing I had been both right, and very, very wrong.

    Initially, I only had direct knowledge of a few large investment firms, and had suspected their competitors’ flows looked much the same. This proved correct.

    But I had also assumed that slightly smaller companies would have only slightly fewer outputs. This just isn’t true.

    Text analytics proved helpful if a company already had writing production capacity. Otherwise, what they needed was a working factory. There were too few companies in the first bucket, because everyone else was crowding the second.

    Epilogue

    As a product, text analytics has been a mixed bag. It made some money, could have probably made some more, but was unlikely to become a runaway success.

    Also, I’d lost my appetite for the New Yorker. At some point it all tipped too far on the side of formulaic, and the magic was gone.

    Words are now in their wholesale era, what with large language models like ChatGPT. Early on, I considered applying pipelines to discern whether text is machine generated, but what would be the point?

    Instead, in late 2023 I began working on a solution that helps companies expand their capacity to write for expert clients. It’s an altogether different journey, still in its infancy.

    In the end, I came to think of text analytics as an extra pair of glasses. Every now and then, it turns fuzziness sharp. I keep it in my pocket, just in case.


