Submitted by bo_peng t3_11f9k5g in MachineLearning
bo_peng OP t1_jaixxp5 wrote
Reply to comment by satireplusplus in [P] ChatRWKV v2 (can run RWKV 14B with 3G VRAM), RWKV pip package, and finetuning to ctx16K by bo_peng
Thank you :) I was using the markdown mode instead because I didn't know this
KerfuffleV2 t1_jaiz1k8 wrote
Unfortunately, that doesn't work on the old reddit layout. We just see a garbled mess.
Here's a fixed version of the code/examples:
(not my content)
Example:
'cuda:0 fp16 *10 -> cuda:1 fp16 *8 -> cpu fp32'
= first 10 layers on cuda:0 fp16, then 8 layers on cuda:1 fp16, then on cpu fp32
'cuda fp16 *20+'
= first 20 layers on cuda fp16, then stream the remaining layers through the same device (loaded onto the GPU as needed)
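To make the strategy format above concrete, here is a minimal sketch of how such a string could be split into per-device segments. The `parse_strategy` helper is hypothetical and written only for illustration; it is not the parser used inside the `rwkv` package, and it assumes every segment has the form `device dtype [*N[+]]`.

```python
def parse_strategy(strategy):
    """Split a strategy string into (device, dtype, layers, stream) tuples.

    Hypothetical helper for illustration only, not part of the rwkv package.
    layers is None when the segment takes all remaining layers.
    stream is True when the layer count ends with '+'.
    """
    plan = []
    for segment in strategy.split('->'):
        parts = segment.split()
        device, dtype = parts[0], parts[1]
        layers, stream = None, False
        if len(parts) > 2:
            spec = parts[2].lstrip('*')      # '*20+' -> '20+'
            stream = spec.endswith('+')      # trailing '+' marks streaming
            layers = int(spec.rstrip('+'))   # '20+' -> 20
        plan.append((device, dtype, layers, stream))
    return plan


print(parse_strategy('cuda:0 fp16 *10 -> cuda:1 fp16 *8 -> cpu fp32'))
print(parse_strategy('cuda fp16 *20+'))
```

Each tuple reads left to right in the same order as the string: the first segment claims the first layers, and a segment without a count takes whatever is left.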
import os

# Set these before importing rwkv.model
os.environ['RWKV_JIT_ON'] = '1'
os.environ["RWKV_CUDA_ON"] = '0' # if '1' then compile CUDA kernel for seq mode (much faster)

from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# download models: https://huggingface.co/BlinkDL
model = RWKV(model='/fsx/BlinkDL/HF-MODEL/rwkv-4-pile-169m/RWKV-4-Pile-169M-20220807-8023', strategy='cpu fp32')
pipeline = PIPELINE(model, "20B_tokenizer.json") # find it in https://github.com/BlinkDL/ChatRWKV

ctx = "\nIn a shocking finding, scientist discovered a herd of dragons living in a remote, previously unexplored valley, in Tibet. Even more surprising to the researchers was the fact that the dragons spoke perfect Chinese."
print(ctx, end='')

def my_print(s):
    # Print each generated piece immediately, without a trailing newline
    print(s, end='', flush=True)

# For alpha_frequency and alpha_presence, see "Frequency and presence penalties":
# https://platform.openai.com/docs/api-reference/parameter-details
args = PIPELINE_ARGS(temperature = 1.0, top_p = 0.7,
                     alpha_frequency = 0.25,
                     alpha_presence = 0.25,
                     token_ban = [0], # ban the generation of some tokens
                     token_stop = []) # stop generation whenever you see any token here

pipeline.generate(ctx, token_count=512, args=args, callback=my_print)
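For anyone wondering what the alpha_frequency and alpha_presence settings above do, here is a rough sketch of the penalty formula described in the linked "Frequency and presence penalties" docs: every token that has already been generated has its logit lowered by a flat presence amount plus a per-occurrence frequency amount. The `apply_penalties` function is a hypothetical name used for illustration, and this is an assumption about the general technique, not a copy of what the pipeline does internally.

```python
from collections import Counter


def apply_penalties(logits, generated, alpha_frequency=0.25, alpha_presence=0.25):
    """Return a copy of logits with repetition penalties applied.

    Hypothetical helper for illustration only.
    logits: list of floats indexed by token id.
    generated: list of token ids produced so far.
    """
    counts = Counter(generated)
    out = list(logits)
    for token, count in counts.items():
        # Flat penalty for having appeared at all, plus one step per occurrence
        out[token] -= alpha_presence + count * alpha_frequency
    return out


# Token 2 appeared twice, token 0 once, token 1 never
print(apply_penalties([1.0, 2.0, 3.0], generated=[2, 2, 0]))
```

Tokens that have not appeared yet keep their original logit, so higher values of either setting push the sampler away from repeating itself.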
I kind of want to know what happens in the story...
bo_peng OP t1_jaj2pr2 wrote
Strange. All spaces are lost even when I add 4 spaces in front of all code lines.
UPDATE: works in markdown editor :)