Street_Excitement_14
Street_Excitement_14 t1_irmwhvt wrote
Reply to comment by redditnit21 in [D] CSV File to training and testing split by redditnit21
You are welcome. My advice is not to do anything manually; do it with pandas. For example, you can use the pandas '.loc' indexer to filter the training rows and write the matching images to a training folder, and so on. If you get stuck at any point, search the internet or ask. Good luck, have a nice day :)
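A minimal sketch of the '.loc' filtering step described above, not a definitive implementation. It assumes the dataframe already has a 'path' column (image file path) and a 'split' column (either 'train' or 'test'); those column names and the helper name copy_split are hypothetical and were not given in the thread.

```python
import shutil
from pathlib import Path

import pandas as pd


def copy_split(df: pd.DataFrame, split: str, dest_root: Path) -> list:
    """Copy every image whose 'split' value equals `split` into dest_root/<split>/.

    Assumes hypothetical columns 'path' and 'split' exist in df.
    """
    # .loc with a boolean mask keeps the matching rows and selects the 'path' column.
    paths = df.loc[df["split"] == split, "path"]

    dest_dir = Path(dest_root) / split
    dest_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for src in paths:
        target = dest_dir / Path(src).name
        shutil.copy2(src, target)  # copy rather than move, so the originals stay intact
        copied.append(target)
    return copied
```

Calling copy_split(df, "train", Path("data")) and then copy_split(df, "test", Path("data")) would fill data/train and data/test. Note that images sharing a file name would overwrite each other in the destination folder.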
Street_Excitement_14 t1_irmupgk wrote
u/in-your-own-words is explaining it nicely, but I feel you lack the basics, so here are my two cents:
- You have 2 different entities: A. a csv file giving the path and the output class of each image, and B. the images themselves.
- Keep your csv file in a dataframe (i.e. in pandas) and shuffle it.
- Create a new column that indicates whether the image in each row is for training or for testing. (You can take the indexes and split them, or you can do a stratified split, i.e. take 20% test data from each class.) If you get stuck, search the web or ask on Stack Overflow. pandas, sklearn and numpy are your friends.
- Now you have a dataframe that has image path, image class and a train/test indicator. Using this dataframe, you can create train and test folders and copy/move the corresponding images to those folders easily. You can also create two different csv files from your dataframe, one for train and one for test, and save them as well.
- Alternatively, you can feed data directly from your dataframe to the learning algorithm. The loader function takes an image path as input, loads the image, and feeds it to the algorithm.
Do not ask for code:) As I have said, mostly pandas is your friend
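For readers who want a starting point anyway, the shuffle and stratified-split steps above can be sketched with pandas alone. This is one possible approach under stated assumptions, not the commenter's own code: the column names 'path', 'label' and 'split' and the function name add_split_column are hypothetical, the dataframe index is assumed to be unique, and groupby(...).sample needs pandas 1.1 or newer. sklearn's train_test_split with its stratify argument is a common alternative.

```python
import pandas as pd


def add_split_column(df, label_col="label", test_frac=0.2, seed=0):
    """Return a shuffled copy of df with a 'split' column set to 'train' or 'test'.

    The test rows are drawn per class, so each class contributes roughly
    test_frac of its rows (a stratified split).
    """
    # Shuffle all rows; the original index is kept so test rows can be marked below.
    shuffled = df.sample(frac=1.0, random_state=seed).copy()

    # Draw test_frac of the rows from every class separately.
    test_index = (
        shuffled.groupby(label_col)
        .sample(frac=test_frac, random_state=seed)
        .index
    )

    shuffled["split"] = "train"
    shuffled.loc[test_index, "split"] = "test"
    return shuffled.reset_index(drop=True)
```

From the result, out[out["split"] == "train"].to_csv("train.csv", index=False) would save the training rows as their own csv file, and the same pattern works for the test rows.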
Street_Excitement_14 t1_irmygtj wrote
Reply to comment by redditnit21 in [D] CSV File to training and testing split by redditnit21
Trust me, it pays off well (even outside the data science field, it gives you the ability to manage tabular data effectively).
https://www.kaggle.com/learn/pandas
https://www.coursera.org/learn/python-data-analysis?irclickid=0uOThVwCHxyNT0H2N%3ASXpxqkUkDQrvVBXW5J1Q0&irgwc=1&utm_medium=partners&utm_source=impact&utm_campaign=3294490&utm_content=b2c
https://www.coursera.org/specializations/data-science-python