Shortest program to split a string at non-digits without RegExps

16

EDIT: If you are using Lisp, I have given some guidelines on byte counting at the bottom.

Goal: Write the shortest function that splits a string at non-digits and returns an array consisting of only the digits in each string, without using any regular expressions. Leading zeroes are to be included in each string.

Current standings (separated into categories):

  • C/C++/C#/Java: 68 (C) ....
  • GolfScript/APL/J: 13 (APL)
  • All others: 17 (Bash, uses tr), 24 (Ruby)

Rules:

(I apologize for the lengthiness)

  1. The format must be a function with a single string argument. Up to two additional arguments may be added if necessary for the proper return of the array (e.g. sh/csh/DOS Batch needs an extra variable reference to return, etc.).
  2. The primary function declaration does not count, and neither does importing other standard libraries. `#include`s, `import`s, and `using`s do not count. Everything else does. This includes `#define`s and helper functions. Sorry for the confusion. Refer to this as a helpful guide to what does and does not count (written in C-style syntax):
    // does not count toward the total; may be omitted unless
    // non-obvious, like half of Java's standard library.
    #include <stdio.h>
    
    import some.builtin.Class // does not count, see above
    
    #define printf p // counts toward the total
    
    /* Any other preprocessor directives, etc. count. */
    
    int i = 0; // counts
    
    someFunction(); // counts
    
    char[][] myMainSplitFunction(char[][] array) { // does not count
      // Everything in here counts
      return returnArray; // Even this counts.
    } // does not count
    
    /* Everything in here counts, including the declaration */
    char[][] someHelperFunction(char[] string) {
      // stuff
    } // even this counts
    
  3. The output must be a string array or similar (array lists in Java and similar are acceptable). Examples of accepted output: String[], char[][], Array, List, and Array (object).
  4. The array must contain only variable-length string primitives or string objects. No empty strings may appear in the return value, with the exception below. Note: each string is to contain one run of consecutive matches, as in the example input and output below.
  5. If there are no matches, the function body should return null, an empty array/list, or an array/list containing an empty string.
  6. No external libraries allowed.
  7. DOS line endings count as one byte, not two (already covered in meta, but needs to be emphasized).
  8. And the biggest rule here: no regular expressions allowed.

This is a code-golf question, so the smallest size wins. Good luck!

And here are some example inputs and outputs (with C-style escapes):

Input: "abc123def456"
Output: ["123", "456"]

Input: "aitew034snk582:3c"
Output: ["034", "582", "3"]

Input: "as5493tax54\\[email protected]"
Output: ["5493", "54", "430", "52", "9"]

Input: "sasprs]tore\"re\\forz"
Output: null, [], [""] or similar

Please state how many bytes your answers use and, as always, happy golfing!


Guidelines for Lisp

Here is what does and does not count in Lisp dialects:

;;; Option 1

(defun extract-strings (a b) ; does not count
  (stuff) ;;; Everything in here counts
) ; does not count

;;; Option 2

(defun extract-strings (string &aux (start 0) (end 0)) ; does not count
  (stuff) ;;; Everything in here counts
) ; does not count
All other lambdas fully count toward the byte count.

Isiah Meadows
Wasn't this asked before?
Ismael Miguel
1
Yes, but I re-asked it on Meta and made substantial edits before posting it here again. Because of this, it should not be classified as a duplicate (the other related question should be closed if it is not already).
Isiah Meadows
2
Shouldn't your "golf" be posted as an answer?
MrWhite
44
Sorry, but -1 for disallowing GolfScript. All languages should be allowed.
Doorknob
1
@Doorknob That is true, but I also understand the OP's feelings. People should have a chance to compete even if they don't speak GolfScript, J, or APL (and I am guilty of readily using the latter in these competitions). Can you take a look at my proposal in the thread he linked to?
Tobia

Answers:

10

APL, 13 chars

(or 28/30 bytes, read below)

{⍵⊂⍨⍵∊∊⍕¨⍳10}

I see you have banned GolfScript from your question. I understand your sentiment, but I hope this community won't eventually ban APL, because it is a truly remarkable programming language with a long history, not to mention a lot of fun to code in. Maybe it could just be scored differently, if people feel it is competing unfairly. I will post my thoughts on this matter in the thread you linked.

In that same vein, I have always added a footnote to my APL posts, claiming that APL could be scored as 1 char = 1 byte. My claim rests on the fact that a few (mostly commercial) APL implementations still support their own legacy single-byte encoding, with the APL symbols mapped to the upper 128 byte values. But maybe this is too much of a stretch, in which case you may want to score this entry as 28 bytes in UTF-16 or 30 bytes in UTF-8.

Explanation

{        ⍳10}  make an array of naturals from 1 to 10
       ⍕¨      convert each number into a string
      ∊        concatenate the strings into one (it doesn't matter that there are two 1s)
    ⍵∊         test which chars from the argument are contained in the digit string
 ⍵⊂⍨           use it to perform a partitioned enclose, which splits the string as needed
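
For readers who do not read APL, the steps listed above can be sketched in Python. This is only an illustration, not something the answer itself provides: the name partition_digits is made up here, and itertools.groupby stands in for the partitioned enclose.

```python
from itertools import groupby


def partition_digits(s):
    # Mirrors the digit string built from 1..10: "12345678910".
    # The repeated "1" is harmless, and the "0" comes from "10".
    digits = "".join(str(n) for n in range(1, 11))
    # Boolean mask: True where the character is contained in the digit string.
    mask = [ch in digits for ch in s]
    pieces, pos = [], 0
    # Keep every maximal run of True entries, skip every run of False entries.
    for keep, run in groupby(mask):
        length = len(list(run))
        if keep:
            pieces.append(s[pos:pos + length])
        pos += length
    return pieces


print(partition_digits("ab5c0x"))
```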

Examples

      {⍵⊂⍨⍵∊∊⍕¨⍳10} 'ab5c0x'
 5  0 
      {⍵⊂⍨⍵∊∊⍕¨⍳10}  'z526ks4f.;8]\p'
 526  4  8 

The default output format for an array of strings does not make it clear how many strings are in the array, nor how many blanks. But a quick manipulation to add quotes should make it clear enough:

      {q,⍵,q←'"'}¨ {⍵⊂⍨⍵∊∊⍕¨⍳10} 'ab5c0x'
 "5"  "0" 
      {q,⍵,q←'"'}¨ {⍵⊂⍨⍵∊∊⍕¨⍳10}  'z526ks4f.;8]\p'
 "526"  "4"  "8" 
Tobia
Regarding your comment, I think that for other languages to compete fairly against the "shorthand" ones, each symbol in the other languages should be counted as one character. For example, my Mathematica solution posted here should be counted as 7 (more or less). I think designing a language with compressed tokens has no merit at all.
Dr. belisarius
Could you provide a hex dump of your golf? I can't read some of the characters.
Isiah Meadows
@impinball How would the hexdump help you? It's not like you would see what is being done.
mniip
@impinball The APL code is {omega enclose commute omega epsilon epsilon format each iota 10}. If you need the Unicode values, you can just copy and paste it into any online tool, even if you can't see the characters (which is strange, as most modern Unicode fonts have the APL symbols). In any case, what you get is this: {\u2375\u2282\u2368\u2375\u220a\u220a\u2355\u00a8\u237310} (mind the final "10", which is not part of the escape sequence).
Tobia
1
Instead of ∊⍕¨⍳10, couldn't you just use ⎕D? That should be the constant '0123456789'. Dyalog APL at least supports it, and so does NARS2000.
marinus
5

Python 47

Implementation

f=lambda s:"".join([' ',e][e.isdigit()]for e in s).split()

Demonstration

>>> sample=["abc123def456","aitew034snk582:3c","as5493tax54\\[email protected]","sasprs]tore\"re\\forz"]
>>> [f(data) for data in sample]
[['123', '456'], ['034', '582', '3'], ['5493', '54', '430', '52', '9'], []]

Algorithm

Convert every non-digit character to a space and then split the resulting string. A simple and clear approach.
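
A non-golfed sketch of the same idea, for illustration only. It assumes that only the ASCII digits should count, so it tests membership in "0123456789" instead of calling str.isdigit, which is also true for some non-ASCII characters such as "²". The name split_digits is chosen here and is not part of the answer.

```python
ASCII_DIGITS = "0123456789"


def split_digits(s):
    # Step 1: every character that is not an ASCII digit becomes a space.
    spaced = "".join(ch if ch in ASCII_DIGITS else " " for ch in s)
    # Step 2: split() without arguments collapses runs of whitespace
    # and never yields empty strings.
    return spaced.split()


print(split_digits("aitew034snk582:3c"))
```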

And a fun solution with itertools (71 characters)

f1=lambda s:[''.join(v)for k,v in __import__("itertools").groupby(s,key=str.isdigit)if k]
Abhijit
4

Ruby, 70

f=->(s){s.chars.chunk{|c|c.to_i.to_s==c}.select{|e|e[0]}.transpose[1]}

Online version for testing

Since converting any non-digit character to an int returns 0 in Ruby (with to_i), converting each char to an int and back to a char is the regex-free way to check for a digit...
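
The round-trip test and the chunking can be sketched in Python as follows. This is an illustration under assumptions, not the answer's code: to_i is a hypothetical helper that imitates the "0 for anything non-numeric" behaviour on a single character, and itertools.groupby plays the role of chunk.

```python
from itertools import groupby


def to_i(ch):
    """Hypothetical stand-in for to_i on one character: 0 when it is not a number."""
    try:
        return int(ch)
    except ValueError:
        return 0


def is_digit(ch):
    # A character is a digit exactly when it survives the int -> str round trip.
    return str(to_i(ch)) == ch


def digit_runs(s):
    # Group consecutive characters by the predicate and keep only the digit groups.
    return ["".join(run) for keep, run in groupby(s, key=is_digit) if keep]


print(digit_runs("ab5c0x"))
```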

David Herrmann
You could also do ('0'..'9').member? for every char, but what you did is already shorter
fgp
You're definitely right, I should have said: "a" way ;)
David Herrmann
4

bash, 26 (function contents: 22 + array assignment overhead 4)

This isn't going to beat the other bash answer, but it's interesting because it might make you do a double take:

f()(echo ${1//+([!0-9])/ })

Usage is:

$ a=(`f "ab5c0x"`); echo ${a[@]}
5 0
$ a=(`f "z526ks4f.;8]\p"`); echo ${a[@]}
526 4 8
$ 

At first glance, //+([!0-9])/ looks a lot like a regexp substitution, but it isn't. It is a bash parameter expansion, which follows pattern-matching rules rather than regular-expression rules.

Returning true bash array types from bash functions is a pain, so I chose to return a space-delimited list instead, then convert it to an array in an array assignment outside the function call. So in the interest of fairness, I feel the (` `) around the function call should be included in my score.

Digital Trauma
3

Mathematica 32

StringCases[#,DigitCharacter..]&

Usage

inps ={"abc123def456", "aitew034snk582:3c", "as5493tax54\\[email protected]", 
        "sasprs]tore\"re\\forz"}  
StringCases[#,DigitCharacter..]&/@inps

{{"123", "456"}, 
 {"034", "582", "3"}, 
 {"5493", "54", "430", "52", "9"}, 
 {}
}

The equivalent using regexes is much longer!:

StringCases[#, RegularExpression["[0-9]+"]] &
Dr. belisarius
Mathematica sucks at regex.
CalculatorFeline
3

Bash, 17/21 bytes (previously 21 bytes; improved by DigitalTrauma)

Building a space-separated list with tr

function split() {
tr -c 0-9 \ <<E
$1
E
}

replaces any non-digit with a space

Usage

$ for N in $(split 'abc123def456'); do echo $N; done
123
456

Edit

As the comments below point out, the code can be reduced to 17 bytes:

function split() (tr -c 0-9 \ <<<$1)

and since the result is not, strictly speaking, a Bash array, the usage should be

a=(`split "abc123def456"`); echo ${a[@]}

and the extra (``) should be counted

Coaumdio
1
Gah you beat me to it! But why not use a here-string instead of a here-document? Also you can save a newline at the end of the function content if you use (blah) instead of {blah;}: split()(tr -c 0-9 \ <<<$1). That way your function body is only 17 chars.
Digital Trauma
1
Your function returns a "space-separated list" instead of an array. Certainly returning true arrays from bash function is awkward, but you could at least assign the result of your function to an array in your usage: a=($(split "12 3a bc123")); echo ${a[@]}. It could be argued that "($())" be counted in your score
Digital Trauma
Before exploring the tr approach, I tried doing this with a parameter expansion. tr is definitely the better approach for golfing purposes.
Digital Trauma
Have you tried surrounding the tr with the expansion operator? It would come out to something like ($(tr...)), and where the function declaration doesn't count, the outer parentheses wouldn't count against you. It would only be the command substitution part.
Isiah Meadows
I don't see how this should be working, but I'm not fluent in Bash arrays. Anyway, the (``) construct is one char shorter than the ($()) one and should be preferred.
Coaumdio
2

Smalltalk (Smalltalk/X), 81

f := [:s|s asCollectionOfSubCollectionsSeparatedByAnyForWhich:[:ch|ch isDigit not]]

f value:'abc123def456' -> OrderedCollection('123' '456')

f value:'aitew034snk582:3c' -> OrderedCollection('034' '582' '3')

f value:'as5493tax54\[email protected]' -> OrderedCollection('5493' '54' '430' '52' '9')

f value:'sasprs]tore\"re\forz' -> OrderedCollection()

sigh - Smalltalk has a tendency to use veeeery long function names...

blabla999
2
Is that a function name? o__O
Tobia
@tobia Apparently...
Isiah Meadows
asCollectionOfSubCollectionsSeparatedByAnyForWhich ಠ_ಠ This name is too long
TuxCrafting
1

R, 81

f=function(x){
s=strsplit(x,"",T)[[1]]
i=s%in%0:9
split(s,c(0,cumsum(!!diff(i))))[c(i[1],!i[1])]
}

The function accepts a string and returns a list of strings.

Examples:

> f("abc123def456")
$`1`
[1] "1" "2" "3"

$`3`
[1] "4" "5" "6"

-

> f("aitew034snk582:3c")
$`1`
[1] "0" "3" "4"

$`3`
[1] "5" "8" "2"

$`5`
[1] "3"

-

> f("as5493tax54\\[email protected]")
$`1`
[1] "5" "4" "9" "3"

$`3`
[1] "5" "4"

$`5`
[1] "4" "3" "0"

$`7`
[1] "5" "2"

$`9`
[1] "9"

-

> f("sasprs]tore\"re\\forz")
$<NA>
NULL

Note: $x is the name of the list element.

Sven Hohenstein
1

Perl, 53

Edit: on no matches, sub now returns list with empty string (instead of empty list) as required.

It also avoids splitting on a single space character, as that triggers the 'split on any white-space' behavior, which probably violates the rules. I could use the / / delimiter, which would split on a single space, but paradoxically it would look like using a regexp pattern. I could use unpack at the cost of some extra characters and so get rid of the split controversy altogether, but I think that what I ended up with, splitting on a literal character (other than space), is OK.

sub f{shift if(@_=split a,pop=~y/0-9/a/csr)[0]eq''and$#_;@_}

And, no, Perl's transliteration operator doesn't do regular expressions. I can unroll 0-9 range to 0123456789 if that's the problem.
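
The replace-and-squeeze step followed by a split on a literal character can be sketched in Python. This is a rough illustration rather than a faithful port: the function names are made up, trailing empty fields are trimmed by hand because Python's str.split keeps them, and in this sketch an input without digits yields [""] (which rule 5 of the question allows).

```python
DIGITS = "0123456789"


def squeeze_non_digits(s, delim="a"):
    """Replace every run of non-digits with one delimiter character."""
    out = []
    for ch in s:
        if ch in DIGITS:
            out.append(ch)
        elif not out or out[-1] != delim:
            out.append(delim)
    return "".join(out)


def split_digits(s):
    parts = squeeze_non_digits(s).split("a")
    # Drop trailing empty fields, keeping at least one entry.
    while len(parts) > 1 and parts[-1] == "":
        parts.pop()
    # A delimiter at the very start produces a leading empty string;
    # shift it away only if the list contains other entries.
    if parts[0] == "" and len(parts) > 1:
        parts.pop(0)
    return parts


print(split_digits("abc123def456"))
```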

user2846289
As long as it doesn't use regular expressions, it's valid.
Isiah Meadows
My Perl is not so strong. If I understand the code, you are replacing non-digits with a specific non-digit, then splitting on that chosen non-digit, then filtering out empty strings. Is this a correct reading?
Tim Seguine
1
@TimSeguine: Not exactly. Non-digits are replaced and squashed to a single character, splitting on which produces empty string if that delimiter happens to be at the beginning. It is then shifted away if list contains other entries.
user2846289
Empty list is okay.
Isiah Meadows
1

C, 68 bytes (only the function's body)

void split (char *s, char **a) {
int c=1;for(;*s;s++)if(isdigit(*s))c?*a++=s:0,c=0;else*s=0,c=1;*a=0;
}

The first argument is the input string, the second one is the output array, which is a NULL-terminated string array. Sufficient memory must be reserved for a before calling the function (worst case: sizeof(char*)*((strlen(s)+1)/2+1), which includes the slot for the terminating NULL).

The input string is modified by the function (every non-digit character is replaced by '\0')

Usage example

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>

void split (char *s, char **a) {
int c=1;for(;*s;s++)if(isdigit(*s))c?*a++=s:0,c=0;else*s=0,c=1;*a=0;
}   

void dump(char **t) {
    printf("[ ");for(;*t;t++)printf("%s ", *t);printf("]\n");
}   

int main() {
    char **r = malloc(1024);
    char test1[] = "abc123def456";
    char test2[] = "aitew034snk582:3c";
    char test3[] = "as5493tax54\\[email protected]";
    char test4[] = "sasprs]tore\"re\\forz";
    split(test1,r); 
    dump(r);
    split(test2,r); 
    dump(r);
    split(test3,r); 
    dump(r);
    split(test4,r); 
    dump(r);
    return 0;
}

Output

[ 123 456 ]
[ 034 582 3 ]
[ 5493 54 430 52 9 ]
[ ]

Un-golfed version:

void split (char *s, char **a) {
    int c=1; // boolean: the latest examined character is not a digit
    for(;*s;s++) {
        if(isdigit(*s)) {
            if(c) *a++ = s; // stores the address of the beginning of a digit sequence
            c=0;
        } else {
            *s=0; // NULL-terminate the digit sequence
            c=1;
        }   
    }   
    *a = 0; // NULL-terminate the result array
} 
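
The in-place technique can also be sketched in Python with a mutable bytearray standing in for the char buffer. This is only an illustration with made-up names (split_in_place, c_string_at): offsets into the buffer take the place of the char* pointers, and it assumes ASCII input.

```python
def split_in_place(buf):
    """Overwrite every non-digit byte of buf with 0 and return where each digit run starts."""
    starts = []
    after_non_digit = True          # plays the role of the flag c in the C code
    for i, byte in enumerate(buf):
        if 48 <= byte <= 57:        # ASCII '0'..'9'
            if after_non_digit:
                starts.append(i)    # beginning of a new digit run
            after_non_digit = False
        else:
            buf[i] = 0              # terminate the previous run in place
            after_non_digit = True
    return starts


def c_string_at(buf, start):
    """Read the zero-terminated run beginning at start, as a C string would be read."""
    end = buf.find(b"\0", start)
    if end == -1:
        end = len(buf)
    return buf[start:end].decode("ascii")


buf = bytearray(b"abc123def456")
print([c_string_at(buf, i) for i in split_in_place(buf)])
```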
Coaumdio
1

VBScript, 190 (164 without function declaration)

Function f(i)
For x=1 To Len(i)
c=Mid(i,x,1)
If Not IsNumeric(c) Then
Mid(i,x,1)=" "
End If
Next
Do
l=Len(i)
i=Replace(i,"  "," ")
l=l-Len(i)
Loop Until l=0
f=Split(Trim(i)," ")
End Function

While not competitive at all, I'm surprised that VBScript comes out this short on this given how verbose it is (13 bytes for the CRs alone). It loops through the string, replacing any non-numeric characters with spaces, then reduces all the whitespace to single spaces, and then uses a space delimiter to divide it.
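
The three steps described above (blank out non-digits, collapse doubled spaces until the length stops changing, then split on a single space) can be sketched in Python. The function name is made up, and returning an empty list for an input without digits is a choice made for this sketch.

```python
def split_digits(s):
    # Step 1: replace every non-digit character with a space.
    spaced = "".join(ch if ch in "0123456789" else " " for ch in s)
    # Step 2: keep replacing double spaces with single ones until nothing changes.
    while True:
        collapsed = spaced.replace("  ", " ")
        if len(collapsed) == len(spaced):
            break
        spaced = collapsed
    # Step 3: trim and split on the single-space delimiter.
    trimmed = spaced.strip(" ")
    return trimmed.split(" ") if trimmed else []


print(split_digits("z526ks4f.;8]\\p"))
```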

Test cases

Input: "ab5c0x"
Output: 5,0

Input: "z526ks4f.;8]\p"
Output: 526,4,8
Comintern
DOS line endings count as one character as far as I've read on meta.
Isiah Meadows
I suggested an edit for you.
Isiah Meadows
The count already assumes Linux style 1 byte line endings. I get 190 characters by my count (just verified again).
Comintern
Ok. I must have miscounted.
Isiah Meadows
1

Common Lisp (1 according to the letter; ≈173 according to the spirit)

Here's a readable version. The byte count is fairly high because of the long names in things like digit-char-p and position-if and vector-push-extend.

(defun extract-numeric-substrings (string &aux (start 0) (end 0) (result (make-array 0 :adjustable t :fill-pointer 0)))
  (loop 
     (unless (and end (setq start (position-if #'digit-char-p string :start end)))
       (return result))
     (setq end (position-if (complement #'digit-char-p) string :start (1+ start)))
     (vector-push-extend (subseq string start end) result)))
(extract-numeric-substrings "abc123def456")
#("123" "456")

(extract-numeric-substrings "aitew034snk582:3c")
#("034" "582" "3")

(extract-numeric-substrings "as5493tax54\\[email protected]")
#("5493" "54" "430" "52" "9")

(extract-numeric-substrings "sasprs]tore\"re\\forz")
#()

The concept of "function declaration" is sort of vague. Here's a version that only has one byte (the character x in the function body); everything else is bundled into the auxiliary variables of the function's lambda list (part of the function's declaration):

(defun extract-numeric-substrings (string 
                                   &aux (start 0) (end 0) 
                                   (result (make-array 0 :adjustable t :fill-pointer 0))
                                   (x (loop 
                                         (unless (and end (setq start (position-if #'digit-char-p string :start end)))
                                           (return result))
                                         (setq end (position-if (complement #'digit-char-p) string :start (1+ start)))
                                         (vector-push-extend (subseq string start end) result))))
  x)

The actual byte count will depend on how many of auxiliary declarations would have to be moved into the body for this to be deemed acceptable. Some local function renaming would help, too (e.g., shorten position-if since it appears twice, use single letter variables, etc.).

This rendering of the program has 220 characters:

(LOOP(UNLESS(AND END(SETQ START(POSITION-IF #'DIGIT-CHAR-P STRING :START END)))(RETURN RESULT))(SETQ END(POSITION-IF(COMPLEMENT #'DIGIT-CHAR-P)STRING :START(1+ START)))(VECTOR-PUSH-EXTEND(SUBSEQ STRING START END)RESULT))

If nothing else, this should promote Common Lisp's &aux variables.

This can be written more concisely with loop, of course:

(defun extract-numeric-substrings (s &aux (b 0) (e 0) (r (make-array 0 :fill-pointer 0)))
  (loop 
     with d = #'digit-char-p 
     while (and e (setq b (position-if d s :start e)))
     finally (return r)
     do 
       (setq e (position-if-not d s :start (1+ b)))
       (vector-push-extend (subseq s b e) r)))

The loop form, with extra space removed, has 173 characters:

(LOOP WITH D = #'DIGIT-CHAR-P WHILE(AND E(SETQ B(POSITION-IF D S :START E)))FINALLY(RETURN R)DO(SETQ E(POSITION-IF-NOT D S :START(1+ B)))(VECTOR-PUSH-EXTEND(SUBSEQ S B E)R))
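
For comparison, the same start/end scanning loop can be sketched in Python. The helper position_if below is a hypothetical imitation of the Lisp function of that name (it returns None where Lisp returns NIL), written only for this illustration.

```python
def position_if(pred, s, start=0):
    """Index of the first character at or after start that satisfies pred, else None."""
    for i in range(start, len(s)):
        if pred(s[i]):
            return i
    return None


def is_digit(ch):
    return ch in "0123456789"


def extract_numeric_substrings(s):
    result = []
    end = 0
    while end is not None:
        start = position_if(is_digit, s, end)
        if start is None:
            break
        # end is None when the digit run extends to the end of the string;
        # slicing with None then takes everything up to the end.
        end = position_if(lambda ch: not is_digit(ch), s, start + 1)
        result.append(s[start:end])
    return result


print(extract_numeric_substrings("aitew034snk582:3c"))
```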
Joshua Taylor
I would count starting from (result on to the final parenthesis to be the body. The part that defines the name and parameters are the declaration.
Isiah Meadows
Please refer to rule 2 on my amended rules to see what I'm really talking about in a function declaration (basically, declare function name, parameters, and if syntactically required, which is rare among interpreted languages, the return type).
Isiah Meadows
@impinball Yeah, the "1" count is sort of a joke, but the important part here is that result is declared as a parameter here; it just has a very non-trivial initialization form. It's the same thing, in principle, as an optional argument with a default value that's computed by some complex expression. (In simpler cases, it's easy to imagine something like char* substring( char *str, int begin, int end(0) ) in some language with a C-like syntax to specify that end is optional and that if it's not provided, then its value is 0. I'm just highlighting the fact that some of these terms
Joshua Taylor
@impinball aren't quite concrete and language agnostic enough to prevent some trollish byte counts. :)
Joshua Taylor
The first part that isn't specifying parameters is where I would start counting (e.g. (defun fn (string &aux (start 0) (end 0) wouldn't count, but everything remaining in the lambda would).
Isiah Meadows
0

JavaScript, 240 bytes

And for those of you who are curious, here's my probably huge golf:

function split(a) { // begin function
function f(c){for(var a=-1,d=9;d--;){var e=c.indexOf(d+"");0
>e||e<a&&(a=e)}return 0<a?a:null}var c=f(a);if(null==c)retur
n null;var d=[];for(i=0;;){a=a.substring(c);d[i]||(d[i]="");
c=f(a);if(null==c)break;d[i]+=a.charAt(c);0<c&&i++}return d;
} // end function

Above in pretty print:

function split(a) {
    function f(c) {
        for (var a = -1, d = 9;d--;) {
            var e = c.indexOf(d + "");
            0 > e || e < a && (a = e);
        }
        return 0 < a ? a : null;
    }
    var c = f(a);
    if (null == c) return null;
    var d = [];
    for (i = 0;;) {
        a = a.substring(c);
        d[i] || (d[i] = "");
        c = f(a);
        if (null == c) break;
        d[i] += a.charAt(c);
        0 < c && i++;
    }
    return d;
}

Above in normal descriptive code

function split(a) {
    function findLoop(string) {
        var lowest = -1;
        var i = 9;
        while (i--) {
            var index = string.indexOf(i + '');
            if (index < 0) continue;
            if (index < lowest) lowest = index;
        }
        return (lowest > 0) ? lowest : null;
    }
    var index = findLoop(a);
    if (index == null) return null;
    var ret = [];
    i = 0;
    for ( ; ; ) {
        a = a.substring(index);
        if (!ret[i]) ret[i] = '';
        index = findLoop(a);
        if (index == null) break;
        ret[i] += a.charAt(index);
        if (index > 0) i++;
    }
    return ret;
}
Isiah Meadows
0

PHP 134

function f($a){
$i=0;while($i<strlen($a)){!is_numeric($a[$i])&&$a[$i]='-';$i++;}return array_filter(explode('-',$a),function($v){return!empty($v);});
}
Einacio
You can shorten it by leaving out the callback at array_filter. This will automatically remove all entries which are false when they're casted to booleans.
kelunik
@kelunik that would filter out 0s as well
Einacio
0

C, 158

#define p printf
char s[100],c;int z,i;int main(){while(c=getchar())s[z++]=(c>47&&c<58)*c;p("[");for(;i<z;i++)if(s[i]){p("\"");while(s[i])p("%c",s[i++]);p("\",");}p("]");}

Since C doesn't have built-in array print functions, I had to do that work on my own, so I apologize that there is a trailing comma in every output. Essentially, the code reads the string, replaces every character that is not a digit with '\0', and then loops through the buffer and prints out all of the runs of digits. (This assumes EOF=0.)

Input: ab5c0x
Output: ["5","0",]

Input: z526ks4f.;8]\p
Output: ["526","4","8",]

ASKASK
According to the question's rules (rule 2), you only have to count the characters in the function body. So your solution would actually be less than 170 bytes. I'm not sure if the count includes variable prototypes outside the function body, though.
grovesNL
I will amend the rules on this: #defines, variable declarations, etc. will count, but the function declaration will not.
Isiah Meadows
Also, last time I checked, there was a type in C notated as char[][] which is legal. If you return as that (or char**), you will be fine.
Isiah Meadows
It doesn't have to be text output? I thought the program was supposed to output the array in a string format.
ASKASK
0

C#, 98

static string[] SplitAtNonDigits(string s)
{
    return new string(s.Select(c=>47<c&c<58?c:',').ToArray()).Split(new[]{','},(StringSplitOptions)1);
}

First, this uses the LINQ .Select() extension method to turn all non-digits into commas. string.Replace() would be preferable, since it returns a string rather than an IEnumerable<char>, but string.Replace() can only take a single char or string and can't make use of a predicate like char.IsDigit() or 47<c&c<58.

As mentioned, .Select() applied to a string returns an IEnumerable<char>, so we need to turn it back into a string by turning it into an array and passing the array into the string constructor.

Finally, we split the string at commas using string.Split(). (StringSplitOptions)1 is a shorter way of saying StringSplitOptions.RemoveEmptyEntries, which automatically takes care of multiple consecutive commas and commas at the start/end of the string.

BenM
1
Instead of char.IsDigit(c), you can use '/'<c&&c<':'
grovesNL
1
Good point...or even better, 47<c&&c<58. (Frankly, I'm surprised it works with numbers, but apparently it does).
BenM
1
And I can save an extra valuable character by using a single '&' instead of a double '&&'. In C#, this is still a logical AND when both operands are booleans; it only does a bitwise AND when they're integers.
BenM
Nice one. I didn't know it was able to do that.
grovesNL
A slightly shorter variant is to split on white space instead of ',' and then manually remove the empty items: return new string(s.Select(c=>47<c&c<58?c:' ').ToArray()).Split().Where(a=>a!="").ToArray();
VisualMelon
0

JS/Node : 168 162 147 138 Chars

function n(s){
var r=[];s.split('').reduce(function(p,c){if(!isNaN(parseInt(c))){if(p)r.push([]);r[r.length-1].push(c);return 0;}return 1;},1);return r;
}

Beautified version:

function n(s) {
  var r = [];
  s.split('').reduce(function (p, c) {
    if (!isNaN(parseInt(c))) {
      if (p) {
        r.push([]);
      }
      r[r.length - 1].push(c);
      return 0;
    }
    return 1;
  }, 1);
  return r;
}
palanik
This question only wants the array returned, so you can remove console.log(r) and some other things
Not that Charles
The function declaration doesn't count toward the score (reason is to help level the playing field)
Isiah Meadows
Ok. Adjusted the score as per @impinball's comment. (Actually there are two functions declared here. Char count includes the anonymous function)
palanik
It should. I updated the rules to help explain it better.
Isiah Meadows
Meanwhile, came up with something better...
palanik
0

Ruby, 24

f=->s{s.tr("
-/:-~",' ').split}

Defines digits using negative space within the printable ascii range.
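
The "negative space" idea can be sketched in Python with a translation table. This is an illustration, not a port: the table name is made up, and it covers the same two ranges as the tr call above, from the newline character up to "/" and from ":" up to "~". Characters outside those ranges (for example non-ASCII ones) are left untouched in this sketch.

```python
# Every code point in the ranges "\n".."/" and ":".."~" maps to a space;
# the only gap between the two ranges is "0".."9".
NON_DIGITS = {
    cp: " "
    for cp in (*range(ord("\n"), ord("/") + 1), *range(ord(":"), ord("~") + 1))
}


def split_digits(s):
    # Translate the non-digits to spaces, then split on whitespace.
    return s.translate(NON_DIGITS).split()


print(split_digits("abc123def456"))
```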

histocrat
Function declaration doesn't count.
Isiah Meadows
0

php, 204

function s($x){$a=str_split($x);$c=-1;$o=array();
for($i= 0;$i<count($a);$i++){if(ord($a[$i])>=48&&ord($a[$i])<=57)
{$c++;$o[$c]=array();}while(ord($a[$i])>=48&&ord($a[$i])<=57)
{array_push($o[$c],$a[$i]);$i++;}}return $o;}

Descriptive Code:

function splitdigits($input){

    $arr = str_split($input);
    $count = -1;
    $output = array();
    for($i = 0; $i < count($arr); $i++){


    if(ord($arr[$i]) >= 48 && ord($arr[$i]) <= 57){
        $count++;
        $output[$count] = array();
    }

    while(ord($arr[$i]) >= 48 && ord($arr[$i]) <= 57){
        array_push($output[$count], $arr[$i]);
        $i++;
    } 

}

return $output;
}

This is pretty long code and I'm sure there will be a much shorter php version for this code golf. This is what I could come up with in php.

palerdot
There are some improvements: you can replace array() with [], array_push($output[$count], $arr[$i]); with $output[$count][]=$arr[$i];, and the ord() checks with is_numeric(). And you don't even need to split the string to iterate over its characters. Also, only the inner code of the function counts, so as it is your char count is 204.
Einacio
The function declaration doesn't count. Refer to rule 2 as a guide on what counts and what doesn't.
Isiah Meadows
0

Python

def find_digits(_input_):
    a,b = [], ""
    for i in list(_input_):
        if i.isdigit(): b += i
        else:
            if b != "": a.append(b)
            b = ""
    if b != "": a.append(b)
    return a
I left StackExchange
0

Python 104 83

def f(s, o=[], c=""):
    for i in s:
        try:int(i);c+=i
        except:o+=[c];c=""
    return [i for i in o+[c] if i]

@Abhijit's answer is far more clever; this is just a "minified" version of what I had in mind.

assert f("abc123def456") == ["123", "456"]
assert f("aitew034snk582:3c") == ["034", "582", "3"]
assert f("as5493tax54\\[email protected]") == ["5493", "54", "430", "52", "9"]
assert f("sasprs]tore\"re\\forz") == []

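These assertions produce no output, so the code works, provided each one is run on its own in a fresh session: the default arguments are created once, at declaration, so the list o keeps its contents from one call to the next.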
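
A non-golfed sketch that sidesteps the shared default list, for illustration only. It uses None as the default and builds a fresh list on every call, which costs bytes but lets the function be called repeatedly; this variant is an assumption about how one might restructure the code, not the author's version.

```python
def f(s, o=None, c=""):
    # A fresh list per call; a default of [] would be created once and shared.
    o = [] if o is None else o
    for ch in s:
        try:
            int(ch)          # raises ValueError for anything that is not a digit
            c += ch
        except ValueError:
            o.append(c)
            c = ""
    # Drop the empty strings collected between consecutive non-digits.
    return [run for run in o + [c] if run]


print(f("abc123def456"))
print(f("aitew034snk582:3c"))
```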

gcq
You don't have to count the function declaration, if you did. Just as a heads up
Isiah Meadows
0

PHP 98 89

As in DigitalTrauma's bash answer, this doesn't use a regex.

function f($x) {
// Only the following line counts:
for($h=$i=0;sscanf(substr("a$x",$h+=$i),"%[^0-9]%[0-9]%n",$j,$s,$i)>1;)$a[]=$s;return@$a;
}

Test cases:

php > echo json_encode(f("abc123def456")), "\n";
["123","456"]
php > echo json_encode(f("aitew034snk582:3c")), "\n";
["034","582","3"]
php > echo json_encode(f("as5493tax54\\[email protected]")), "\n";
["5493","54","430","52","9"]
php > echo json_encode(f("sasprs]tore\"re\\forz")), "\n";
null
PleaseStand
0

Haskell 31

{-# LANGUAGE OverloadedStrings #-}
import Data.Char (isDigit)
import Data.Text (split)

f=filter(/="").split(not.isDigit)

It splits the string on all non-numeric characters and removes the empty strings generated by consecutive delimiters.
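
The two stages (split at every character matching a predicate, then filter out the empty pieces) can be sketched in Python. The helper split_when is made up for this illustration; it keeps empty pieces so that the filtering step is visible.

```python
def split_when(pred, s):
    """Break s at every character satisfying pred, keeping empty pieces."""
    pieces, current = [], []
    for ch in s:
        if pred(ch):
            pieces.append("".join(current))
            current = []
        else:
            current.append(ch)
    pieces.append("".join(current))
    return pieces


def f(s):
    def non_digit(ch):
        return ch not in "0123456789"
    # Consecutive delimiters produce empty strings, which are filtered out here.
    return [piece for piece in split_when(non_digit, s) if piece != ""]


print(split_when(lambda ch: ch not in "0123456789", "ab12"))
print(f("abc123def456"))
```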

lortabac
0

VBA 210, 181 without function declaration

Function t(s)
Dim o()
For Each c In Split(StrConv(s,64),Chr(0))
d=IsNumeric(c)
If b And d Then
n=n&c
ElseIf d Then:ReDim Preserve o(l):b=1:n=c
ElseIf b Then:b=0:o(l)=n:l=l+1:End If:Next:t=o
End Function
Gaffi
0

Rebol (66 chars)

remove-each n s: split s complement charset"0123456789"[empty? n]s

Ungolfed and wrapped in function declaration:

f: func [s] [
    remove-each n s: split s complement charset "0123456789" [empty? n]
    s
]

Example code in Rebol console:

>> f "abc123def456"
== ["123" "456"]

>> f "aitew035snk582:3c"
== ["035" "582" "3"]

>> f "as5493tax54\\[email protected]"
== ["5493" "54" "430" "52" "9"]

>> f {sasprs]torer"re\\forz}
== []
draegtun
0

JavaScript, 104 97 89

Golfed:

Edit: When the loop walks off the end of the string, c is undefined, which is falsy and terminates the loop.

2/27: Using ?: saves the wordiness of if/else.

function nums(s) {
s+=l='length';r=[''];for(k=i=0;c=s[i];i++)r[k]+=+c+1?c:r[k+=!!r[k][l]]='';
r[l]--;return r
}

The carriage return in the body is for readability and is not part of the solution.

Ungolfed:

The idea is to append each character to the last entry in the array if it is a digit and to ensure the last array entry is a string otherwise.

function nums(s) {
    var i, e, r, c, k;
    k = 0;
    s+='x'; // ensure the input does not end with a digit
    r=[''];
    for (i=0;i<s.length;i++) {
        c=s[i];
        if (+c+1) { // if the current character is a digit, append it to the last entry
            r[k] += c;
        }
        else { // otherwise, add a new entry if the last entry is not blank
            k+=!!r[k].length;
            r[k] = '';
        }
    }
    r.length--; // strip the last entry, known to be blank
    return r;
}
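
The same "append to the last entry" idea, sketched in Python for illustration. It keeps the sentinel trick from the ungolfed version; the digit test uses plain string membership rather than the +c+1 coercion, which is a substitution made for this sketch.

```python
def nums(s):
    runs = [""]
    # The sentinel "x" ensures the input never ends with a digit,
    # so the list always finishes with an empty entry.
    for ch in s + "x":
        if ch in "0123456789":
            runs[-1] += ch        # extend the current run
        elif runs[-1]:
            runs.append("")       # start a new entry only if the last one is not blank
    runs.pop()                    # strip the last entry, known to be blank
    return runs


print(nums("abc123def456"))
```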
DocMax
0

Javascript, 72

function f(a){
 a+=".",b="",c=[];for(i in a)b=+a[i]+1?b+a[i]:b?(c.push(b),""):b;return c
}

Ungolfed

a+=".",b="",c=[];        //add '.' to input so we dont have to check if it ends in a digit
for(i in a)
    b=+a[i]+1?           //check if digit, add to string if it is
        b+a[i]:         
    b?                   //if it wasnt a digit and b contains digits push it
        (c.push(b),""):  //into the array c and clear b
    b;                   //else give me b back
return c

Sample input/output

console.log(f("abc123def456"));
console.log(f("aitew034snk582:3c"));
console.log(f("as5493tax54\\[email protected]"));
console.log(f("sasprs]tore\"re\\forz"));

["123", "456"]
["034", "582", "3"]
["5493", "54", "430", "52", "9"]
[] 

JSFiddle

Danny
1
I like it! Much simpler than my own. You can drop another 8 characters by replacing if(+a[i]+1)b+=a[i];else if(b)c.push(b),b="" with b=+a[i]+1?b+a[i]:b?(c.push(b),""):b.
DocMax
@DocMax thx, I edited to include your suggestion :). That (c.push(b),"") seemed clever, never seen that.
Danny
I had forgotten about it until I saw it used extensively earlier today in codegolf.stackexchange.com/questions/22268#22279
DocMax
That's not valid, ' ' is mistaken for 0 and it's a javascript quirk difficult to manage. Try '12 34 56'
edc65
0

R 52

This function splits strings by character class (this is not regex! :)). The class N matches numeric characters, and \P{N} is the negation of that class. o=T means omit empty substrings.

x
## [1] "wNEKbS0q7hAXRVCF6I4S" "DpqW50YfaDMURB8micYd" "gwSuYstMGi8H7gDAoHJu"
require(stringi)
stri_split_charclass(x,"\\P{N}",o=T)
## [[1]]
## [1] "0" "7" "6" "4"

## [[2]]
## [1] "50" "8" 

## [[3]]
## [1] "8" "7"
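
A rough Python counterpart, for illustration only. It assumes that "numeric" means any character whose Unicode general category starts with N, checked here with the standard unicodedata module; the function name is made up.

```python
import unicodedata


def split_on_non_numeric(s):
    # Replace every character outside the Unicode "N" categories with a NUL marker.
    marked = "".join(
        ch if unicodedata.category(ch).startswith("N") else "\0" for ch in s
    )
    # Split on the marker and omit the empty substrings.
    return [piece for piece in marked.split("\0") if piece]


for x in ["wNEKbS0q7hAXRVCF6I4S", "DpqW50YfaDMURB8micYd", "gwSuYstMGi8H7gDAoHJu"]:
    print(split_on_non_numeric(x))
```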
bartektartanus
0

PHP 99

<?php

$a = function($s) {
foreach(str_split($s)as$c)$b[]=is_numeric($c)?$c:".";return array_filter(explode('.',implode($b)));
};

var_dump($a("abc123def456"));
var_dump($a("aitew034snk582:3c"));
var_dump($a("as5493tax54\\[email protected]"));
var_dump($a("sasprs]tore\"re\\forz"));


Output

array(2) {
  [3]=>
  string(3) "123"
  [6]=>
  string(3) "456"
}
array(3) {
  [5]=>
  string(3) "034"
  [8]=>
  string(3) "582"
  [9]=>
  string(1) "3"
}
array(5) {
  [2]=>
  string(4) "5493"
  [5]=>
  string(2) "54"
  [6]=>
  string(3) "430"
  [7]=>
  string(2) "52"
  [9]=>
  string(1) "9"
}
array(0) {
}
kelunik
0

JavaScript 88

88 chars when not counting function n(x){}

function n(x){
y=[],i=0,z=t=''
while(z=x[i++])t=!isNaN(z)?t+z:t&&y.push(t)?'':t
if(t)y.push(t)
return y
}
wolfhammer