kevindamm
kevindamm t1_j6qmixr wrote
Reply to comment by 9-11GaveMe5G in OpenAI releases tool to detect AI-generated text, including from ChatGPT by whitecastle92
There are four buckets (of unequal size), but I don't know whether success was measured by landing in the "correct" bucket, by being in the highest p(AI-gen) bucket counting as a true positive, or by hitting either extreme (top or bottom) bucket. I only read the journalistic article and not the original research, so I don't know. The 1000-character minimum worries me more; there's quite a lot of text shorter than that (like this comment).
kevindamm t1_j6q2lwy wrote
A 1000-character minimum and a 26% success rate, but it's good that they're working on it.
kevindamm t1_jbq3w44 wrote
Reply to [D] What's the Time and Space Complexity of Transformer Models Inference? by Smooth-Earth-9897
The analysis isn't as straightforward as that, for a few reasons. Transformer architectures are typically a stack of blocks that alternate Multi-Head Attention (MHA) and Multi-Layer Perceptron (MLP) sublayers, where the MHA concatenates and projects the outputs of its attention heads before feeding them to the MLP. Each layer is dominated by matrix multiplies, and if it were all computed on a CPU then a reasonable upper bound would be O(n^3), where n is the width of the widest layer.

But the bottleneck isn't really how many multiplies a CPU would have to do, because inference typically runs on a GPU or TPU, which parallelizes most of the additions and multiplications in the matrix ops. The real bottleneck is often the memory traffic to and from the GPU or TPU, and that varies greatly with model size, GPU memory limits, batch size, etc.
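To make that concrete, here's a rough back-of-envelope sketch in Python. The layer sizes are made up (not taken from any particular model); the point is just to compare per-block FLOPs against the bytes of weights that have to be streamed, which is roughly what decides whether a block is compute-bound or memory-bound at a given batch size.

```python
# Back-of-envelope estimate for one transformer block at inference.
# All sizes below are assumptions for illustration, not from any real model.

d_model = 4096       # hidden width (assumed)
seq_len = 2048       # tokens in the context (assumed)
bytes_per_param = 2  # fp16 weights

# MHA: Q/K/V projections + output projection ~ 4 * seq * d^2 multiply-adds,
# plus the attention score and value matmuls ~ 2 * seq^2 * d.
mha_flops = 2 * (4 * seq_len * d_model**2 + 2 * seq_len**2 * d_model)

# MLP with the usual 4x expansion: two matmuls of (seq x d) by (d x 4d).
mlp_flops = 2 * (2 * seq_len * d_model * 4 * d_model)

# Weights that must be read from memory for this block (~12 * d^2 params).
block_params = 4 * d_model**2 + 2 * 4 * d_model**2
weight_bytes = block_params * bytes_per_param

print(f"block FLOPs  ~ {mha_flops + mlp_flops:.3e}")
print(f"weight bytes ~ {weight_bytes:.3e}")
# Compare the FLOPs-per-byte ratio here against your accelerator's
# compute/bandwidth ratio to guess whether you're compute- or memory-bound.
```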
You're better off profiling performance for a particular model and hardware combination.
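For example, a minimal profiling sketch with PyTorch's built-in profiler. The model and input shapes here are toy stand-ins just to make it runnable; swap in the actual model and inputs you care about.

```python
import torch
from torch.profiler import profile, ProfilerActivity

# Stand-in model and input, purely illustrative.
model = torch.nn.TransformerEncoder(
    torch.nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True),
    num_layers=4,
).eval()
inputs = torch.randn(1, 256, 512)  # (batch, seq_len, d_model), arbitrary

activities = [ProfilerActivity.CPU]
if torch.cuda.is_available():
    model, inputs = model.cuda(), inputs.cuda()
    activities.append(ProfilerActivity.CUDA)

with torch.no_grad(), profile(activities=activities, profile_memory=True) as prof:
    model(inputs)

# Sort by time to see whether matmul kernels or memory copies dominate
# for this particular model + hardware combination.
sort_key = "cuda_time_total" if torch.cuda.is_available() else "cpu_time_total"
print(prof.key_averages().table(sort_by=sort_key, row_limit=15))
```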